subject message "dbpedia ontology version 3.9" "uHello, I'm a bit puzzled about the current version of the dbpedia ontology. The owl Version of the ontology ( More puzzling is the fact that dbpedia_3.9.owl does not contain all of the classes listed at I found however a third file at In summary there are three different versions of dbpedia.owl which all carry the versionInfo tag 3.8 and only one of them is coherent with the listing at Best regards Johannes uHi Johannes, the version number was indeed not updated for the 3.9 release. The subclasses mentioned were not on the mapping wiki when 3.9 was released. The mapping wiki always shows the current status of DBpedia (since it is evolving over time), which can also be found on DBpedia live. The DBpedia dumps are snapshots of a certain point in time of the mappings wiki and Wikipedia (dumps). Thus the difference. Nothing to worry about. Cheers, Anja On 30 Apr 2014, at 10:50, < > < > wrote:" "Complete set of DBpedia resources?" "uHi all, Luis says: Luis, instance types are extracted from mappings that were handcrafted at This means that if a page does not have an infobox, it will not have a type statement. You can get a list of all DBpedia resources that DBpedia Spotlight cares about by running our indexing code and grabbing the \"Concept URIs\" file. As an alternative, you could just \"bzcat | cut | sort | uniq\" the union of DBpedia download files. But, folks, I think that Luis has a point. On the one hand, by definition, everything is an owl:Thing, so even if we do not include this statement explicitly, any reasoner should be able to infer that. On the other hand, we do not assume that people are using reasoners, so I wonder what is the right thing to do here. A few options that I considered: 1) add documentation explaining that clients should infer \"rdf:type owl:Thing\" for every resource mentioned anywhere in DBpedia 2) provide a new dataset \"obvious_instance_types_en.nt\" with statements like \"?resource rdf:type owl:Thing\" for every DBpedia Resource that we include in our dumps 3) include \"?resource rdf:type owl:Thing\" in instance_types for every resource that does not have another type there Is there another easy way to grab the set of all resources in DBpedia? Cheers, Pablo Hi all, Luis says: First I assumed that the infobox types (instance_types_en.nt.bz2) contains all the tags which spotlight is able to extract from a text. However as I found out later, it does not. For example Pablo" "Importing Wikipedia" "uHi all, for those of you who want to run the DBpedia extraction locally, here are some tips on how to import a Wikipedia dump: 1. configure your mysql server!
that's very very important. if you have enough RAM, use it. Find my mysql config below. 2. have two hard-disks. one for the mysql data, one for the wikipedia dumps. 3. use a standalone machine. the Wikipedia import puts a lot of load on harddisk and cpu. I used to use one of our application servers, which already had some load on it, and the import took weeks. 4. defrag your harddisks. can save some time. 5. configure the dbpedia import script. if you're not running a server OS, remove the \"-server\" flag in the java mwdumper call. (ok, this is not a performance tip, just a note for getting it working) On my workstation (intel quadcore 2,66 ghz, 8gb ram, vista 64bit, two 10k harddisks), the Wikipedia import took around a very decent 6 hours. Cheers, Georgi mysql config: key_buffer = 1024M max_allowed_packet = 32M table_cache = 256 sort_buffer_size = 512M net_buffer_length = 8M read_buffer_size = 64M read_rnd_buffer_size = 64M myisam_sort_buffer_size = 512M" "how to import OWL-ontology to repository" "uHello Everybody, I've built my own virtuoso-endpoint and now I want to test it with some data. But I can't find out how to import an ontology (*.owl). Does anybody know how to do that? Thank you. Cheers, Nico uN.Stieler wrote: Please see: 1. VirtRDFInsert" "DBpedia RDF filenames" "uHi All, Am writing, with a niggling issue with the dbpedia3.5.1 tar ball. I have downloaded and parsed the whole lot using libraptor, and yay it all seems to parse correctly, but I wonder why all of the file names are .nt ? For the ntriples serialisation of RDF is said to be in \"ascii\", and most of the .nt files in the tar ball are not legal ascii, and hence not legal ntriples. Saying that I managed to parse all of the files using a turtle parser, turtle allows for unicode characters to be used. This could be slightly confusing to someone coming to use the dbpedia tar ball for the first time, perhaps all of the file extensions should be .ttl which seems to be the standard file extension for turtle files. Keep up the good work people. Regards, Mischa Mischa Tuffield PhD Email: Homepage - Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW, UK +44 7989 199 786 Registered in England and Wales 535 7233 VAT # 849 0517 11 Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD" "About editor permissions for mappings.dbpedia" "uDear all! My name is Vadim Onufriev. I'd like to write missing labels for the Russian language in Please, tell me what should I do to get the editor permissions there? uHi Vadim & happy new year you got editor rights but in the meantime we changed the process of requesting a new account, using this list is not efficient Read the mapping documentation and always ask in case of doubt.
Cheers, Dimitris On Fri, Dec 30, 2016 at 3:32 PM, Вадим < > wrote:" "dbpedia 3.8 in Maven repo" "uHello, In change log for release 3.8 you write that you \"plan to deploy the DBpedia framework in the main Maven repository, so users don't have to compile the framework anymore. \" That's fantastic news but I checked the central Maven repository and it's not there yet. When are you planning to deploy to Maven central and will there be some notification on the blog or this mailing list? Thanks, Piotr u+1 :) Nicolas. On Aug 8, 2012, at 11:47 AM, Piotr Jagielski < > wrote: uDoes anybody know anything about it? I can compile the library myself but if this pain can be avoided I would really love to know. Thanks, Piotr On 2012-08-09 22:56, Nicolas Torzec wrote: uLet's say, that we formulated the plan to do it, but it still needs to be implemented. I would assume three to six months (or nine) unless either you or Nicolas want to help speed up the process? Maybe we can start with collecting TODOs ? All the best, Sebastian Am 18.08.2012 23:24, schrieb Piotr Jagielski: uHi Sebastian, Although this task is very useful, I would probably be more useful at making DBpedia works on a distributed platform like Hadoop, for processing full database dumps and incremental updates. Nicolas. On Aug 20, 2012 11:53 PM, \"Sebastian Hellmann\" < > wrote: uAlright, that's good news. So Piotr is volunteering to work on the Maven Central release task while Nicolas is volunteering to work on the Hadoop indexing? Shall we propose some estimated completion dates, or perhaps milestones? Cheers, Pablo On Sat, Sep 15, 2012 at 4:52 PM, Nicolas Torzec < >wrote: uI asked for the access to the wiki so we can start collecting TODOs as suggested by Sebastian but I have not received a response so far. For me the priority of this task is lower at the moment because I have to use locally patched version anyway. Piotr On 2012-09-16 12:14, Pablo N. Mendes wrote: uHi Piotr, AFAIK the wiki is open to all. What happens when you try to create an account at wiki.dbpedia.org? There is also mappings.dbpedia.org, where I can give you edit permissions if you tell me your username. Cheers Pablo On Sep 16, 2012 12:21 PM, \"Piotr Jagielski\" < > wrote: uOK, I noticed the login link at the very bottom of the page and I verified I can edit pages after I registered. Thanks! Piotr On 2012-09-16 12:59, Pablo N. Mendes wrote:" "election results" "uHi I am new to SPARQL but would like to use dbpedia for fetching election results. I've tried the following query but the results come out unormalised. I expected to see: Tony Blair 10724953 William Hague 8357,615 Charles Kennedy 4814321 Can anyone suggest what is going wrong? Thanks George H i I am new to SPARQL but would like to use dbpedia for fetching election results. I've tried the following query George uHi, currently, there is no mapping for Infobox_election yet that the MappingExtractor could use. That is why the data is extracted by the InfoboxExtractor, which uses our initial, now three year old infobox parsing approach and accepts a lot of input without normalizing, cleaning or checking it. 
This extractor produces relations with the namespace Cheers, Max On Sun, Jul 25, 2010 at 2:07 PM, George Hamilton < > wrote:" "Infobox conversions from wikipedia to dbpedia - some questions" "uHi, Let me give an example: Wikipedia page: Dbpedia Page: Let's look at the wikipedia text and see who this guy was influenced by: In dbpedia, however we learn that our subject, George Meyer, has ''Mad'' magazine by David Owen. Published in ''the New Yorker'' on March 13, 2000. Owen notes in the article that at the time of its writing, he had known Meyer for nearly 25 years, when they were both students and members of the ''Harvard Lampoon'' writing staff. (en) My first question is where is this coming from? Is this from an older wikipedia page? I could not readily see that in the history tab in wikipedia. Is there a way to map versions between dbpedia and wikipedia? Also in dbpedia we see another different influences uri: with the following objects: * dbpedia:Harvard_Lampoon * dbpedia:Mad_%28magazine%29 * dbpedia:The_New_Yorker Where are these in wikipedia and how do the two influences properties relate? The last question is that I can find entries in dbpedia with only two URI's in a row: for example: but no object. If you go to wikipedia, there is no object either. So this is a triple that is not really a triple? I know the conversion is not trivial, I just want to understand what to make of this and what semantics I can expect to find in dbpedia so any comments would be very appreciated. Thanks much! Marvin" "Open Everything Berlin, Saturday 6th December 2008" "uThought people might be interested to hear about Open Everything Berlin, which will take place on Saturday 6th December 2008. More details at: Warm regards, Jonathan Gray The Open Knowledge Foundation" "DBPedia Server RAM" "uHi, What is amount of RAM on which DBPedia Server is running? Thanks, Yashpal uDo you mean the server at On Apr 8, 2013 1:18 PM, \"Yashpal Pant\" < > wrote: uI mean: From: Jona Christopher Sahnwaldt < > To: Yashpal Pant < > Cc: dbpedia-discussion < > Sent: Monday, April 8, 2013 6:17 PM Subject: Re: [Dbpedia-discussion] DBPedia Server RAM Do you mean the server at On Apr 8, 2013 1:18 PM, \"Yashpal Pant\" < > wrote:" "DBpedia in 2011 - Opening up to the community" "uDear all, DBpedia has become quite a famous project, but there is one thing that is still missing. Time's Person of the Year 2006: *You* DBpedia has limitless use cases and possibilities for extension: - more languages, more wikis (Wikibooks, Wiktionary), more ontologies, more & better data ! In the last years, it was always my impression that the DBpedia team, although we really worked hard and tried our best, limited, in a way, the advance and improvement of the project. So now in 2011 we will of course continue doing, what we are already doing, but in addition we will hand over some possibilities and access to the DBpedia Community. Some things that already happened: - There is a Wiki for editing the mappings: - You can register at the main DBpedia page update pages such as - A DBpedia Internationalization Committee was founded: Now we will soon go one step further: 1. A new mailing list has been created[1] called dbpedia-developers (it is public now). It is for those of you that really want to or ALREADY work on the scala framework to discuss code and other things.
This will go along with the addition of more developers from the community to the Sourceforge project. For this reason we will switch to Mercurial [2] soon (around January 17th) and then give access to new developers. Mercurial is a distributed system and it is much easier to handle branches. Developers can test new extractors and new features in a branch and once it is stable we will merge it into the trunk. 2. We integrated Stack and Semantic Overflow on the site to help people get help: A lot of community members were using it already. 3. There are two more pages I'd like to mention: Committee: 4. There is a mailing list for the Wiktionary conversion; the dump will come out some time this year, I guess: The question we still have is: What other infrastructure shall we - the maintainers - provide for the community, so the project can blossom: Now we have 2 Wikis, 2 Mailing lists, 2 StackOverflows and soon 1 Mercurial Repository the community can access Anything else? Sebastian Hellmann On behalf of the DBpedia Team [1] [2] http://mercurial.selenic.com/ uOn 1/12/11 12:39 PM, Sebastian Hellmann wrote: u" "Lookup not working" "uIs lookup down? When I submit this request: I get back an empty ArrayOfResult. This has been working recently. Just spotted the problem today. I've also posted this: func=detail&aid;=3572718&group;_id=190976&atid;=935521 Rob uHi Rob, There is no class called \"Tom\", therefore the system returns an empty result set. This is the correct behavior. See the \"caution\" notice at Anyway I have restarted the service. Please check if it is ok for you. We are in the process of moving all DBpedia services hosted at the Free University of Berlin. We are trying our best to avoid disruption in service provision, but please be advised that the likelihood of problems will be higher in the next week or so. Cheers, Pablo On Fri, Sep 28, 2012 at 1:25 PM, Rob Nichols < > wrote:" "Spaql endpoint and post method" "uHi, I want to query information using sparql endpoint in a php script. I could only use get method to submit my query. I want to use larger queries to submit using post method. I tried something with curl extension but always got invalid request error. Anyone did this before? Can you help me? here what i tried before: $ch = curl_init(\" curl_setopt($ch, CURLOPT_POST ,1); curl_setopt($ch, CURLOPT_POSTFIELDS ,'default-graph-uri=http%3A%2F%2Fdbpedia.org&should-sponge;=&query;='.urlencode($sparql_query).'&format;=text%2Fxmll&debug;=on'); $sonuc = curl_exec($ch); Thanks in advance. Ahmet uHello, Ahmet YILDIRIM wrote: Assuming $url is the URL you want (you can test it on the command line, include the default graph $headers = array(\"Content-Type: \".$this->contentType); $c = curl_init(); curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); curl_setopt($c, CURLOPT_URL, $url); curl_setopt($c, CURLOPT_HTTPHEADER, $headers); $contents = curl_exec($c); curl_close($c); $contentType can be \"application/sparql-results+xml\", \"application/sparql-results+json\", \"text/rdf+n3\" etc. depending on what you need. The result of your query is in $contents. 
Kind regards, Jens uHello, Jens Lehmann schrieb: > Hello, > > $headers = array(\"Content-Type: \".$this->contentType); > $c = curl_init(); > curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); > curl_setopt($c, CURLOPT_URL, $url); > curl_setopt($c, CURLOPT_HTTPHEADER, $headers); > $contents = curl_exec($c); > curl_close($c); Here is the corresponding POST code: $c = curl_init(); $headers = array(\"Accept: application/sparql-results+JSON\"); curl_setopt($c, CURLOPT_RETURNTRANSFER, true); curl_setopt($c, CURLOPT_URL, \" curl_setopt($c, CURLOPT_HTTPHEADER, $headers); curl_setopt($c, CURLOPT_POST, true); curl_setopt($c, CURLOPT_POSTFIELDS, \"query=$query\"); $contents = curl_exec($c); echo curl_error($c); curl_close($c); Kind regards, Jens uThank you for answering my question. When I apply the code below: $sparql_query=\" SELECT ?label WHERE { ?label ?a ?b} \"; $c = curl_init(); $headers = array(\"Accept: application/sparql-results+html\"); curl_setopt($c, CURLOPT_RETURNTRANSFER, true); curl_setopt($c, CURLOPT_URL, \" curl_setopt($c, CURLOPT_HTTPHEADER, $headers); curl_setopt($c, CURLOPT_POST, true); curl_setopt($c, CURLOPT_POSTFIELDS, \"query=\".urlencode($sparql_query)); $result = curl_exec($c); echo curl_error($c); curl_close($c); echo $result; I get the results as expected. However, when I change sparql_query line with below: $sparql_query=\"SELECT ?label WHERE { { ?a < } UNION { ?a < . } UNION { ?a < } UNION { ?a < http://dbpedia.org/resource/Capital> . } UNION { < http://dbpedia.org/property/turkey> ?a . } UNION { ?a < http://dbpedia.org/property/capital> . } UNION { < http://dbpedia.org/property/turkey> ?a . } UNION { < http://dbpedia.org/property/capital> ?a . } UNION { ?a < http://dbpedia.org/resource/Capital> . } UNION { ?a < http://dbpedia.org/resource/Turkey> . } UNION { < http://dbpedia.org/resource/Capital> ?a . } UNION { ?a < http://dbpedia.org/property/turkey> . } UNION { < http://dbpedia.org/resource/Capital> ?a . } UNION { < http://dbpedia.org/property/turkey> ?a . } UNION { ?a < http://dbpedia.org/property/capital> . } UNION { ?a < http://dbpedia.org/resource/Turkey> . } UNION { < http://dbpedia.org/property/capital> ?a . } UNION { ?a < http://dbpedia.org/property/turkey> . } UNION { < http://dbpedia.org/property/capital> ?a . } UNION { < http://dbpedia.org/property/turkey> ?a . } . ?a rdfs:label ?label . FILTER (LANG(?label) = 'en')}\"; I get: *ERROR The requested URL could not be retrieved While trying to process the request: POST /sparql HTTP/1.1 Host: dbpedia.org Accept: application/sparql-results+html Content-Length: 3129 Content-Type: application/x-www-form-urlencoded Expect: 100-continue The following error was encountered: * Invalid Request Some aspect of the HTTP Request is invalid. Possible problems: * Missing or unknown request method * Missing URL * Missing HTTP Identifier (HTTP/1.0) * Request is too large * Content-Length missing for POST or PUT requests * Illegal character in hostname; underscores are not allowed Your cache administrator is * Where am i doing wrong? Thanks in advance again. Ahmet On Tue, Mar 31, 2009 at 5:28 PM, Jens Lehmann < > wrote: uHello, Ahmet YILDIRIM wrote: [] Did you try to fix all the issues reported by Virtuoso above (by setting further curl parameters)? Kind regards, Jens uWhat are those parameters? I don't know. when I set the sparql_query length to 633(in utf8 encoding) and ecode it with urlencode and use curl to post it, it works. Longer queries doesn't work. Can the problem be because of another reason rather than query length? 
On Tue, Mar 31, 2009 at 8:40 PM, Jens Lehmann < > wrote: uHi Ahmet, Taking the sample query and code you provide I have been able to see the error being reported by the PHP client. Having turned on internal debugging on the Virtuoso server hosting DBpedia I do not see the query hitting the server so I assume the problem is in the PHP client. So as Jens suggest you probably should look at the params that can be passed to the PHP curl client based on the possible causes reported to see if the query can be made to run. I shall do some reading myself to see if I can get it to work with my PHP client (which is a Virtuoso Server with PHP runtime hosting support by the way :-)) Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 31 Mar 2009, at 16:24, Ahmet YILDIRIM wrote: uHello, Ahmet YILDIRIM wrote: I just made a couple of tests with the content-length by simply replacing ?a by ?aaain your query. The result is that the POST method works up to a content-length of 1024 (explicitly specifying the content-length in the header did not help) and the GET method works up to a URL length of 4092. Kind regards, Jens uI would expect POST method be larger than GET method. Why isn't it? Anyway is there a way to increase the post size? I want to submit really large queries which take just a few seconds to answer(Not intensive queries.) On Wed, Apr 1, 2009 at 9:25 AM, Jens Lehmann < > wrote:" "failing link to DBpedia resource containing non-ASCI chars" "uHi, I have a local triplestore (VOS7) containing links to DBpedia Dutch (nl.dbpedia.org). Some of the URI's in DBpedia contain non ASCII characters, like which are presented as such in SPARQL results. But when used in a federated query on my local store including Virtuoso RDFZZ Error DB.DBA.SPARQL_REXEC(' ) returned Content-Type 'text/plain' status 'HTTP/1.1 400 Bad Request ' Virtuoso 37000 Error SP030: SPARQL compiler, line 0: Bad character '\' (0x5c) in SPARQL expression at '\' SPARQL query: define sql:big-data-const 0 define output:format \"HTTP+XML application/sparql-results+xml\" define output:dict-format \"HTTP+TTL text/rdf+n3\" SELECT ?abstract WHERE { ?abstract . } SPARQL query: define sql:big-data-const 0 #output-format:text/html define sql:signal-void-variables 1 SELECT * { SERVICE { ?s owl:sameAs ?dbp . } SERVICE { ?dbp dbpo:abstract ?abstract. } } Any advice on how to handle these non-ASCII chars in DBpedia URI's would greatly appreciated. Thanks, Roland uHi Roland, It looks to me like you imported the IRI dumps of the Dutch DBpedia and try to query using URIs. version, encoded in ASCII, of the IRI. In order for your queries to work you need to use IRIs. Cheers, Alexandru Alexandru-Aurelian Todor Freie Universität Berlin Department of Mathematics and Computer Science Institute of Computer Science AG Corporate Semantic Web Königin-Luise-Str. 24/26, room 116 14195 Berlin Germany On 02/10/2014 09:03 PM, Roland Cornelissen wrote:" "Out of memory problems on dbpedia" "uHi - I have this problem with a CONSTRUCT query that I get out of memory errors on and I don't really understand how and I was wondering if some of you could help me understand why. The reason I don't understand it is that the query works up until I add a new triple to the query. What I mean is that it works if you remove the line `dbpedia-owl:abstract ?abstract.` (and replace the previous `;` with `.`). How I see it the query should be able to run and return a result, especially since a more general query does. 
Perhaps I just don't understand the execution sequence of a sparql-query and that is the problem I am having? Regards, Bjarte Johansen" "Sparql endpoint: Virtuoso 42000 Error SQ200" "uHi, sometimes when I execute queries the sparql endpoint returns this error: Virtuoso 42000 Error SQ200: The memory pool size 80019456 reached the limit 80000000 bytes, try to increase the MaxMemPoolSize ini setting. SPARQL query: define sql:big-data-const 0 #output-format:text/html define sql:signal-void-variables 1 define input:default-graph-uri PREFIX : PREFIX dbp: PREFIX rdf: PREFIX rdfs: PREFIX wgs: PREFIX xsd: SELECT DISTINCT ?label ?sub ?date ?place ?lat ?lon WHERE { ?sub rdf:type dbp:Film; rdfs:label ?label; ?pred_place ?place; dbp:releaseDate ?date; dbp:country ?place. ?place rdf:type ?class ; wgs:lat ?lat ; wgs:long ?lon. ?class rdfs:subClassOf* dbp:PopulatedPlace . FILTER ( ?date >= '1940-01-01'^^xsd:date && ?date <= \"1944-01-01\"^^xsd:date ) FILTER ( ?lat >= \"36.650829\"^^xsd:float && ?lat <= \"47.090542\"^^xsd:float && ?lon <= \"18.51083\"^^xsd:float && ?lon >= \"6.620172\"^^xsd:float ) FILTER ( langMatches(lang(?label), \"EN\")) } ORDER BY ASC(?date) the strange thing is that the error is returned only for specific ontology category. For example \"Film\" in the above query but also \"Book\" and \"Single\". Someone has already fix this problem? Maybe writing the query in a more efficient way? Thanks in advance Fabio" "Person dataset from SPARQL endpoint beyond 10K" "uHello everyone, I am trying to download triples from SPARQL endpoint specific to Person category but hit a limit of 10K triples. I also tried to download the Person dataset that are made publicly available but it seems to have relatively fewer predicates compared to the endpoint. How can I get access to all triples present for DBpedia Person category ? Ankur.
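One commonly used workaround for the endpoint's per-query result cap is to page through results with LIMIT and OFFSET. The following sketch is illustrative rather than taken from this thread; the endpoint URL, the page size, and the restriction to dbo:Person triples are assumptions.

```python
# Hedged sketch: page through dbo:Person triples on the public DBpedia
# SPARQL endpoint using LIMIT/OFFSET to work around the per-query cap.
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # assumption: the public endpoint
PAGE = 10000                            # assumption: at or below the cap

def run(query):
    # send the query as a GET request and ask for SPARQL JSON results
    params = urllib.parse.urlencode({
        "query": query,
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(ENDPOINT + "?" + params) as resp:
        return json.load(resp)["results"]["bindings"]

offset = 0
while True:
    query = (
        "SELECT ?s ?p ?o WHERE { "
        "?s a <http://dbpedia.org/ontology/Person> ; ?p ?o . } "
        "ORDER BY ?s LIMIT %d OFFSET %d" % (PAGE, offset)
    )
    rows = run(query)
    if not rows:
        break
    for row in rows:
        print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
    offset += PAGE
```

Ordering by ?s keeps the pages stable between requests; note that a public endpoint may still throttle or cap very deep offsets, in which case the dump files remain the more reliable route.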
(1) You can download more RDF files to get a more complete data set (2) To pick out references to people you can first of all scan the types to make a list of people, then you can filter for triples where a member of that list is in either the ?s or ?o position" "Disambiguation links dump" "uHi, which of the DBpedia dumps contains the disambiguation links (i.e. dbo:wikiPageDisambiguates)? The downloads page ( \"Dataset Descriptions\" about \"Disambiguation links\" ( the corresponding download. Are the disambiguation links available separately or as a part of another dump? - Jindřich uHello Jindrich, the download-page has been revamped for the upcoming release, containing all datasets in one table. The old Dl pages will be replaced at least for the 2015-04 release as well. For now please find the disambiguations directly on the server: Best, Markus PS.: not sure why disambiguations are not listed on the old Dl page On Thu, Feb 18, 2016 at 11:48 AM, Jindřich Mynarz < > wrote:" "Apple Disambiguation" "uHi, I'm comparing the 3.3 and 3.4 dumps and in particular looking at the disambiguations file. A good example/test I normally use is the concept 'Apple' and how it can be disambiguated into Apple Computers/Apple Records or the fruit with the same name. In the 3.3 release we saw : . . . . . . . . . . . . . . . . In the 3.4 release there are no disambiguations for I'm unsure if this means we now have a separate disambiguation page as of the 3.4 release but when the last extraction run was done (in 3.3) there wasn't a disambiguation page, or if the disambiguation page has changed in some way that breaks the disambiguation extraction code ? Does anyone know what the problem/cause is here ? Cheers, Rob uYes, there are disambiguation pages. Formerly data looked like this: :Apple a :FloweringPlant :Apple :kingdom :Plant :Apple :disambiguates :Apple_Inc. There were even worse cases. e.g. Braniff_(Disambiguation) was made to Braniff, which was a redirect to Braniff_International_Airways There is no correct algortihmic approach to solve this matter, so the reason, why we changed it, is that it reflects the pages on Wikipedia as they are, without any transformations, that might lead to a strange representation. If Apple is a FloweringPlant, how can it disambiguate Apple Inc.? The concepts have nothing to with each other, except sharing the String identifier. Regards, Sebastian Rob Lee schrieb: uSebastian Hellmann wrote: I can see the logic and agree with it (I think), but I can't see any evidence of the change in the changelog for 3.4 (have I just missed it ?) It's quite a big change to have made (the disambiguation dataset has nearly doubled in size between 3.3 and 3.4) - was there any discussion on the mailing list as to the best implementation (for example, it might be nice to add a reference to the disambiguation page - 'disambiguated by' or similar) ? Is there any benefit in setting up a dbpedia-developers list, in order to discuss these changes to the core data ? Rob uRob Lee schrieb: The disambiguation issue was considered a bug and therefore just fixed. Especially, with the upcoming live extraction, DBpedia clearly needs to reflect Wikipedia on a page level. A dbpedia-developers list is something, which is on our todo list, but it should not be another mailing list. Currently, our developers are busy. One thing is the live extraction, the other thing is the complete refactoring of the extraction framework in java/scala. Code fluctuation is so high, that a developers list would be not effective. 
However, I understand the concern, you have. There should be more public influence and we are working on that. Part of it comes bundled with the live extraction and also, once we have a steady code base, there should be appropriate places to discuss changes and enhancements properly. Also the vocabulary used should reflect consensus. As a preview, you can have a look here: Regards, Sebastian uSebastian Hellmann wrote: Hmm, I'm not sure I agree on that, I don't think live extraction should be driving the way data is represented in dbpedia. Live extraction should be a technological solution to a problem, not impact the data model ? I'd also question the fact that a 'bug' can just be 'fixed' without any assessment or notification. When fixing a bug, you typically look at the impact of the change to fix it, in this case the impact is large as it drastically changes the model behind how disambiguations work (in fact I 'd even question classifying it as a bug - there was a clear design intent behind how it used to work, which someone in the dbpedia team disagreed with and decided to change, rather than there being an error in the code/implementation). I'm guessing the way it used to work was designed/decided as a result of this feature request/email : the 3.4 release in the bug tracker (I searched for disambiguation). Is this being used by the current development team, is it worth people raising bugs there anymore ? Also, I'm still a little confused by the disambiguation issue, looking at It still appears to have' disambiguates' properties ? I don't know the reasons behind the rewrite (has there been any discussion about it or if it's needed ?) but I always refer people to is touted. I think thats an argument you could level at any open source project at one time or another but other projects do manage to have them, in fact when code fluctuation is high, thats a very good time to have one so that people discuss changes > However, I understand the concern, you have. There should be more public Thanks, I will have a look. Is it mentioned on the main dbpedia site anywhere? I don't recall seeing it before." "Help with Dbpedia Extraction Framework" "uHey everyone, I’m trying to setup the Dbpedia extraction framework as I’m interested in getting structured data from already downloaded wikipedia dumps. As per my understanding I need to work in the ‘dump’ directory of the codebase. I have tried to reverse engineer ( given scala is new for me) but I need some help. 1. First of all, is there a more detailed documentation somewhere about setting and running the pipeline. The one available on dbpedia.org seems insufficient. 2. I understand that I need to create a config.properties file first where I need to setup input/output locations, list of extractors and the languages. I tried working with the config.properties.default given in the code. There seems to be some typo in the extractor list. ‘org.dbpedia.extraction.mappings.InterLanguageLinksExtractorExtractor’ using this gives ‘class not found’ error. I converted it to ‘org.dbpedia.extraction.mappings.InterLanguageLinksExtractor’. Is it ok ? 3. I can’t find the documentation on how to setup the input directory. Can someone tell me the details? From what I gather, input directory should contain a ‘commons’ directory plus, directory for all languages set in config.properties. All these directories must have a subdirectory whose name should be of YYYYMMDD format. Within that you save the xml files such as enwiki-20111111-pages-articles.xml. Am I right ? 
Does the framework work on any particular dump of Wikipedia? Also what goes in the commons branch ? 4. I ran the framework by copying a sample dump Sorry if the questions are too basic and already mentioned somewhere. I have tried looking but couldn't find it myself. Also another question: Is there a reason for the delay in subsequent Dbpedia releases ? I was wondering , if the code is already there, why does it take 6 months between Dbpedia releases? Is there a manual editorial involved or is it due to development/changes in the framework code which are collated in every release? Thanks and regards, Amit Tech Lead Cloud and Platform Group Yahoo! uI also have the same queries. I'm working on bengali DBpedia. Somebody please answer. On Tue, Nov 22, 2011 at 4:33 PM, Amit Kumar < > wrote: uHi Amit, Thanks for your interest in DBpedia. Most of my effort has gone into DBpedia Spotlight, but I can try to help with the DBpedia Extraction Framework as well. Maybe the core developers can chip in if I misrepresent somewhere. 1) [more docs] I am unaware. 2) [typo in config] Seems ok. 3) Am I right ? Does the framework work on any particular dump of Yes. As far as I can tell, you're right. But there is no particular dump. You just need to follow the convention for the directory structure. The commons directory has a similar structure, see: wikipediaDump/commons/20110729/commonswiki-20110729-pages-articles.xml I think this file is only used by the image extractor and maybe a couple of others. Maybe it should be only mandatory if the corresponding extractors are included in the config. But it's likely nobody got around to implementing that catch yet. 4) It seems the AbstractExtractor requires an instance of Mediawiki running The abstract extractor is used to render inline templates, as many articles start with automatically generated content from templates.
See: Also another question: Is there a reason for the delay in subsequent One reason might be that a lot of the value in DBpedia comes from manually generated \"homogenization\" in mappings.dbpedia.org. That, plus getting a stable version of the framework tested and run would probably explain the choice of periodicity. Best, Pablo On Tue, Nov 22, 2011 at 12:03 PM, Amit Kumar < > wrote: uHi Pablo, Thanks for your valuable input. I got the Mediawiki think working and am able to run the abstract extractor as well. The extraction framework works well for a small sample dataset e.g has around 6300 entries. But when I try to run the framework on the full wikipedia data(en, around 33GB uncompressed) I get java heap space errors. uHi Amit, but it seems that the mvn command spawn another processes and fails to pass on the flags to the new one. If someone has been able to run the framework, could you please share me the details.\" The easiest way to get it working is probably to change the value in the dump/pom.xml here: Extract org.dbpedia.extraction.dump.Extract -Xmx1024m Cheers, Pablo On Thu, Dec 1, 2011 at 8:01 AM, Amit Kumar < > wrote: uHi Pablo, I figured this out just after sending my email. I’m experimenting with some values right now. I’ll let you know if I get it to work. In the meanwhile, if some one already has the working values, it would be a big help. Plus do you know anyone running the DEF on Hadoop ? Thanks Amit On 12/1/11 4:39 PM, \"Pablo Mendes\" < > wrote: Hi Amit, The easiest way to get it working is probably to change the value in the dump/pom.xml here: Extract org.dbpedia.extraction.dump.Extract -Xmx1024m Cheers, Pablo On Thu, Dec 1, 2011 at 8:01 AM, Amit Kumar < > wrote: Hi Pablo, Thanks for your valuable input. I got the Mediawiki think working and am able to run the abstract extractor as well. The extraction framework works well for a small sample dataset e.g has around 6300 entries. But when I try to run the framework on the full wikipedia data(en, around 33GB uncompressed) I get java heap space errors. uHi Amit, I don't know the minimal heap configurations for the DEF. I snooped around Max's machine, and found 1024M in his pom.xml. If he changed, it is somewhere I couldn't find. Last summer I started concocting a hadoop run of the framework, but had to switch my attention somewhere else, and haven't had time to go back since. I do not know of anybody who has done it. Best, Pablo On Thu, Dec 1, 2011 at 12:17 PM, Amit Kumar < > wrote: uWhen using mvn scala:run, use MAVEN_OPTS=-Xmx rather than JAVA_OPTS The dump also comes in 27 files rather than one big one. You can use these alternatively. uHi Tommy, I knew about the MAVEN_OPTS. I tried that but as I mentioned, the flags are not being passed on to the child process being spawned. Turns out its Hardcoded in the pom.xml of dump directory. I too was thinking of using partial wikipedia files as input. The problem is, The input and output mechanism is sort of hardcoded. It expects a single file per langauge e.g input/en/20111107/enwiki-20111107-pages-articles.xml . Now I have two options. If I don't want to make any changes in the code, I could run the framework multiple times, once for each partial file, but then the outputs would in different folders for each file. Or work on running the framework in a way that it picks all the files in a folder and also collate the outputs in a single place. But this would entail changes in code. 
Is there a simple way in the Dbpedia Extraction Framework itself to pick multiple files in one directory and collate the results. I can't seem to find it. As per my understanding I would need to change the ConfigLoader class. Have you either of this ? Thanks and Regards, Amit On 12/1/11 11:22 PM, \"Tommy Chheng\" < > wrote: uTommy, Amit, Well, since the pom.xml is a configuration file, I'd call that \"configurable through the pom.xml\" rather than \"hardcoded\". Or maybe we *should* call it hardcoded so that the maven-scala-plugin guys get alarmed. :) I too had problems getting maven and the scala plugin to take my Xmx parameter. The potential solutions I found were: MAVEN_OPTS (used by the maven process), JAVA_OPTS (used by scala) ( worked for me, and apparently that is a common problem with maven plugins: The only thing that worked for me with the maven-scala-plugin was its own jvmArgs: then the outputs would in different folders for each file. Would this be a problem? They are all NT files, so you can just use \"cat\" to put them back together in a simple bash script, no? a folder and also collate the outputs in a single place Maybe you should file a feature request for this. Best, Pablo On Fri, Dec 2, 2011 at 5:51 AM, Amit Kumar < > wrote: uWikipedia wrote: uHi all, a bug in TableMapping caused these memory problems: For now I just commented them out, I'll try to actually fix them later. Regards, Christopher On Fri, Dec 30, 2011 at 17:20, Max Jakob < > wrote:" "Brackets in Property Uris" "uHi, While running the latest version of the extraction framework on the German data dump, I got some property names that have round brackets in them \"(\" \")\" like for example: \" Pubby crashes when a page is requested that contains such a property. The crash is due to Jena, which executes a remote sparql query on Virtuoso and recieves invalid XML as a response. The problem is that I don't even know where to fix the bug, The URL RFC [1] Section 2.2 states that round brackets can be used without escaping them, the URI RFC [2] section 2.4.3 also doesn't mention them being dissalowed so the extracted URIs should be valid. However I don't know if the RDF spec allows property names to contain round brackets . Is the extracted data invalid, or is there a rdf-spec problem ? Here is an example of an invalid RDF/XML file with the offending property URI: xml version='1.0' encoding='%SOUP-ENCODING%' The error looks like this: com.hp.hpl.jena.shared.JenaException: org.xml.sax.SAXParseException: Element type \"n0pred:Austragungsort\" must be followed by either attribute specifications, \">\" or \"/>\". 
com.hp.hpl.jena.rdf.model.impl.RDFDefaultErrorHandler.fatalError(RDFDefaultErrorHandler.java:45) com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:35) com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:225) com.hp.hpl.jena.rdf.arp.impl.XMLHandler.fatalError(XMLHandler.java:255) org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source) org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source) org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source) org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source) org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source) org.apache.xerces.parsers.XMLParser.parse(Unknown Source) org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) com.hp.hpl.jena.rdf.arp.impl.RDFXMLParser.parse(RDFXMLParser.java:142) com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:158) com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:145) com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:215) com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:197) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:161) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:154) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:152) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.execDescribeQuery(RemoteSPARQLDataSource.java:74) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.getResourceDescription(RemoteSPARQLDataSource.java:52) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.getResourceDescription(BaseServlet.java:62) de.fuberlin.wiwiss.pubby.servlets.PageURLServlet.doGet(PageURLServlet.java:38) de.fuberlin.wiwiss.pubby.servlets.BaseURLServlet.doGet(BaseURLServlet.java:33) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.doGet(BaseServlet.java:89) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) Kind Regards, Alexandru Todor [1] [2] rfc2396.txt uOn 10/08/11 03:01, Alexandru Todor wrote: The RDF spec does allow that, but the problem is that there is no way to serialize such property URIs in RDF/XML. Specifically the fact that your property URI ends with a closing bracket is a problem. It's a known issue that there are valid RDF graphs that can not be represented in RDF/XML (which is one of many good reasons not to use the RDF/XML syntax format). The extracted data is invalid XML, yes: an XML element QName can not contain brackets. Unfortunately, there is no right way to do this in your case. The RDF/XML spec recommends that a writer tries to split the URIref after the last non-NCName character and use an ad-hoc namespace declaration, but if the last character of the URIref is a non-NCName char (such as the closing bracket in your property URI), there is no way to split it, and a writer tool should report an error (apparently Virtuoso has opted for not giving an error but producing invalid XML instead - either way the communication breaks down). 
The only reliable way around the problem is to use a serialization format that does cope with all legal RDF properly, such as N-Triples or Turtle. Cheers, Jeen" "loading dbpedia 3.7 files into local repository" "uHi everybody I wanted to create my own local repository for English version of dbpedia 3.7 and downloaded all files from (I am using the latest sesame and owlim to do this) It seems that mappingbased_properties_en.nt file has many problems, and the main one is that non-URI strings are between \"<\" and \">\" making the parser throw an exception as things such as are not a valid URI. Other examples are many URLs that do not start with between \"<\" and \">\" such as in < I wonder what is the best way to solve this problem. I was optimistic at the beginning and started doing 'replace all' to correct some of the URIs but then it turns out that the problem can not really be reduced to a bunch of patterns easily. for example, there are 'links' to webpages which do not even start with 'www' but they are clearly a URL (e.g. < ayat-algormezi.blogspot.com>). One common exception is this: Not a valid (absolute) URI: ayat-algormezi.blogspot.com [line 83864] org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: ayat-algormezi.blogspot.com [line 83864] at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:566) at org.openrdf.rio.ntriples.NTriplesParser.reportFatalError(NTriplesParser.java:547) at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:295) at org.openrdf.rio.ntriples.NTriplesParser.createURI(NTriplesParser.java:478) at org.openrdf.rio.ntriples.NTriplesParser.parseObject(NTriplesParser.java:326) at org.openrdf.rio.ntriples.NTriplesParser.parseTriple(NTriplesParser.java:246) at org.openrdf.rio.ntriples.NTriplesParser.parse(NTriplesParser.java:170) at org.openrdf.rio.ntriples.NTriplesParser.parse(NTriplesParser.java:112) at org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:406) at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:297) at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:228) at org.researchsem.RepositoryLoader.loadFiles(RepositoryLoader.java:190) at org.researchsem.RepositoryLoader.loadDir(RepositoryLoader.java:126) at org.researchsem.RepositoryUtils.init(RepositoryUtils.java:199) at org.researchsem.InitRepository.main(InitRepository.java:38) Any thought or help much appreciated. u05.12.2011 16:58, Danica Damljanovic: I had similar problem with the Polish DBpedia - the problem is in the source data, i.e. - some of the links are invalid in Wikipedia. I don't have any - replace_it_all solution, but you should file a bug for DBpedia extractor demanding to check if the extracted URIs are valid and not passing them to the output, if they are not. Regarding a short-cut solution to your problem - I would pass the invalid input data via a script in Python or Ruby and leave only these entries which have valid URIs. Cheers, Aleksander uThanks Alexander I think it would really be great if there would be an additional step to the 'extraction' framework which would basically remove invalid triples. There will always be some errors of this kind in Wikipedia and the only way I see it solved in Dbpedia is to check each triple and then publish only the valid entries. 
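As a concrete illustration of the Python route Aleksander suggests, a pre-filtering pass over an N-Triples dump could look like the sketch below. This is a hedged sketch, not code from the thread: the input file name is the one mentioned above, while the output name and the crude validity test are assumptions.

```python
# Hedged sketch of the "drop lines with invalid URIs from an .nt dump" idea.
import re
import sys

# assumption: a crude validity test, absolute http(s) IRIs with no spaces
IRI = re.compile(r'^https?://[^\s<>"{}|\\^`]+$')

def line_ok(line):
    # every <...> term on the line must pass the IRI test
    terms = re.findall(r'<([^>]*)>', line)
    return all(IRI.match(t) for t in terms)

with open("mappingbased_properties_en.nt", encoding="utf-8") as src, \
     open("mappingbased_properties_en.clean.nt", "w", encoding="utf-8") as dst:
    kept = dropped = 0
    for line in src:
        if line.startswith("#") or line_ok(line):
            dst.write(line)
            kept += 1
        else:
            dropped += 1

print("kept %d lines, dropped %d lines" % (kept, dropped), file=sys.stderr)
```

A stricter variant would run the file through a real N-Triples parser and keep only the lines it accepts, at the cost of speed.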
Cheers Danica On 5 December 2011 16:16, Aleksander Pohl < > wrote: uCould this list of incorrect triples feed a data curation process in Wikipedia itself ? Sent from my iPad On 5 Dec 2011, at 17:49, Danica Damljanovic < > wrote:" "Corrupted Keys in DBpedia" "uWhat do people on this list think of the discussion here I think everybody involved in this discussion agrees that DBpedia should stop percent encoding URIRefs. Is there any plan to do that?" "DBpedia Lookup Install" "uHi all, I'm trying to install DBpedia Lookup on my machine, but when launching the command mvn clean install, I've this error: [ERROR] Failed to execute goal on project dbpedia-lookup: Could not resolve dependencies for project org.dbpedia.lookup:dbpedia-lookup:jar:3.1: Could not find artifact org.dbpedia.extraction:core:jar:3.8 in nxparser-repo ( I've found the same error in this discussion, at this link: Any ideas on what is wrong? I think it could probably be due to maven dependencies on pom.xml, but I can't find that jar anywhere. Thanks. Best, Francesco uHi Francesco, What you can do is build the DBpedia 3.8 sources locally to create the dependencies and try again Cheers, Dimitris On Thu, Jan 14, 2016 at 12:59 PM, Francesco Marchitelli < uHi Dimitris, thanks for your quick reply. I've done what you suggested with these commands, referring to this guide on GitHub git clone git://github.com/dbpedia/extraction-framework.git cd extraction-framework git checkout DBpedia_3.8 mvn clean install But I've this error: [ERROR] Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:compile (process-resources) on project core: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1(Exit value: 1) In the stacktrace, I've found this message before: [ERROR] scalac error: /usr/local/data/datasets/dbpedia/2015-04/core/extraction-framework/core/target/classes does not exist or is not a directory Thanks, Francesco 2016-01-14 12:11 GMT+01:00 Dimitris Kontokostas < >: uthis looks like a local dir permission error, did you run anything as root (sudo) and then as normal user? On Thu, Jan 14, 2016 at 2:00 PM, Francesco Marchitelli < > wrote: uYou're right, running all commands as su, it installed all the dependencies. When now I'm trying to execute the mvn clean install in the lookup folder, I have this error: Run starting. Expected test count is: 13 DiscoverySuite: IntegrationTest: gen 15, 2016 11:28:20 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate INFORMAZIONI: Initiating Jersey application, version 'Jersey: 1.5 01/14/2011 12:36 PM' RUN ABORTED java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:463) at sun.nio.ch.Net.bind(Net.java:455) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at sun.net.httpserver.ServerImpl.<init>(ServerImpl.java:100) at sun.net.httpserver.HttpServerImpl.<init>
(HttpServerImpl.java:50) at sun.net.httpserver.DefaultHttpServerProvider.createHttpServer(DefaultHttpServerProvider.java:35) at com.sun.net.httpserver.HttpServer.create(HttpServer.java:129) at com.sun.jersey.api.container.httpserver.HttpServerFactory.create(HttpServerFactory.java:304) [INFO] u\" java.net.BindException: Address already in use\" On Fri, Jan 15, 2016 at 12:44 PM, Francesco Marchitelli < uYes, I've seen it. Showing the IP in use, the 1111 port is in use, which is the port used by Virtuoso. I've tried to kill that PID and reboot the machine, without success. process is trying to open. Do you know something about this? Thanks. 2016-01-15 12:23 GMT+01:00 Dimitris Kontokostas < >:" "World’s Biggest Computer Conference WORLDCOMP is Cancelled" "uThe world’s biggest computer science conference WORLDCOMP is canceled after twelve successful years of service. Defamation campaign is going on WORLDCOMP: Google using the words worldcomp, fake or worldcomp, bogus. We filed lawsuits in 2012 to stop this defamation but no use. Due to this, we are finding it highly difficult to get paper submissions, sponsors, etc. We are determined to solve this problem permanently through legal channels. We are focusing all our efforts on legal matters and that’s why we canceled WORLDCOMP’13. We update WORLDCOMP’s website also, once we complete some legal/technical formalities. As ordered by the University of Georgia, Prof.Hamid Arabnia ( Sincerely, AMG Solo (Ashu Solo) (WORLDCOMP publicity chair since 2005) My interesting activity is: christmas" "OPEN POSITION: Move to Berlin, work on DBpedia (1 year full-time contract)" "uHi all, DBpedia [1] is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia knowledge [2]. DBpedia also plays a central role as an interlinking hub in the emerging Web of Data [3]. The DBpedia Team at Freie Universität Berlin [4] is looking for a developer/researcher who wants to contribute to the further development of the DBpedia information extraction framework, investigate approaches to annotate free-text with DBpedia URIs and participate in the various Linked Data efforts currently advanced by our team. Candidates should have + good programming skills in Java, in addition Scala and PHP are helpful. + a university degree preferably in computer science or information systems. Previous knowledge of Semantic Web Technologies (RDF, SPARQL, Linked Data) and experience with information extraction and/or named entity recognition techniques are a plus. Contract start date: 15 May 2010 Duration: 1 year Salary: around 40.000 Euro/year (German BAT IIa) You will be part of an innovative and cordial team and enjoy flexible work hours. After the year, chances are high that you will be able to choose between longer-term positions at Freie Universität Berlin and at neofonie GmbH. Please contact Chris Bizer via email ( ) until 15 April 2010 for additional details and include information about your skills and experience. The whole DBpedia team is very thankful to neofonie GmbH [5] for contributing to the development of DBpedia by financing this position. neofonie is a Berlin-based company offering leading technologies in the area of Web search, social media and mobile applications. 
Cheers, Chris [1] [2] [3] [4] [5]" ""property is mapped but not found in the template definition"" "uHello I am trying to understand the mapping process What does it mean for a property to be mapped, but not found in the (Wikipedia) template definition? In what cases would that mapped, yet unfound property have been added to the ontology? Should it be removed, since its not in the template? Thanks Johnathan HelloI am trying to understand the mapping processWhat does it mean for a property to be mapped, but not found in the (Wikipedia) template definition? In what cases would that mapped, yet unfound property have been added to the ontology? Should it be removed, since its not in the template? Thanks Johnathan uHi Johnathan, can you please show us an example? I don't know if you are referring to template properties which are mapped although they are not used in the template nor shown in the template definition. Cheers Andrea 2013/10/11 Johnathan James < > u(I replied previously, but I don't think it went to the mailing listapologies for any faux pas) So, I started at the How To Edit DBPedia Mappings page ( i clicked on 'Infobox Mappings' and then 'Australia state or territories' I then get to the DBPedia page which details the mapping for 'Australia state or territories'There is a link to 'Check which properties are not mapped yet.', and when I click on it I get a page indicating many properties are mapped, but are not in the template definition. My question, is why would a property that is not in the template be mapped? Thanks! On Sat, Oct 12, 2013 at 11:55 AM, Andrea Di Menna < > wrote: uHello? Is this thing on? Can anyone please answer, \"What does it mean for a property to be mapped, but not found in the (Wikipedia) template definition?\" I would be glad to RTFM, too, if you could point me to itThanks! On Mon, Oct 14, 2013 at 11:20 AM, Johnathan James < > wrote: uHi Jonathan On 10/14/2013 08:20 PM, Johnathan James wrote: Probably the template definition changed recently and the mapping is not adapted to the changes. Another reason maybe that the template instances in Wikipedia use undefined properties (in the template). There are many cases where people do spelling mistakes on the property names in Wikipedia resulting in no mapping at all, thus, mapping these common mistakes is also an option. Cheers, Dimitis" "Feedback for new website and communication grouo" "uDear all, we tried quite a lot in the past few weeks but the website is irreparable. Luckily there has been a communication group forming, who is mainly concerned with \"spreading the word about DBpedia\" consisting of Martin Kaltenböck, Gerard Kuys , Lieke Verhelst and Michele Barbera. They have started to develop a new website with a web designer contribution of Semantic Web Company. The new site is here: We are migrating ASAP any help and feedback is welcome. Please do so in the following way: 1. Major issues should go to the DBpedia discussion or developers list, e.g. by replying to this email and keeping the communication group in cc ( ) We will try to resolve all good suggestion immediately. 2. If you would like an editor account, please email Nilesh Chakraborty < > notethat this site will stay a Wiki and we are welcoming contributions and a fix-it-yourself DBpedia attitude. 3. any other issues can go to the communication group directly: (you can also sign up here: Thank you for your help, Sebastian uI like the new look, except the boxes that come before links and often break to a new line. 
But what's really important is to preserve all useful old content. Especially the very excellent page So is some content migration planned? How can one contribute: - should I use my account from the mapping wiki, self-register, or ask here? - when'd be the new site open for contribution? Cheers! uHi Vladimir, I sent you login details from skype. For now we keep registration manual until the website gets in a good shape. We plan to migrate all content but we start with the important pages first. Anyone interested in an account? On Mon, Apr 27, 2015 at 12:36 PM, Vladimir Alexiev < > wrote: uHi Vladimir, we just changed the site structure to resemble the old one. - Use Case & Projects are and (this is for end users and decision makers) - Services & Resources is for technical users and developers After migrating the b asic content, we are looking for people who take on the responsibility for certain pages. Are you ok with the current structure? For the Ontology, should we keep only one page up to date or should we have a history, e.g. All the best, Sebastian On 27.04.2015 12:36, Vladimir Alexiev wrote:" "People in dbpedia" "uHi, I've been looking at the people in dbpedia and have some questions about how resources are/aren't assigned particular types, and how data is generated for the data dumps & for the dbpedia live instance. I've highlighted the questions separate from my investigations. While there are a lot of different sub-classes for \"person\", there seem to be three primary classes used to describe people: * foaf:Person * dbpedia-owl:Person * yago:Person100007846 Firstly, I can see that dbpedia-owl:Person is defined to be an equivalent class to foaf:Person so I had naively expected every foaf:Person to also be a dbpedia-owl:Person. But this doesn't seem to be the case. For example: In both cases, these resources have types of yago:Person100007846 & foaf:Person but not dbpedia-owl:Person. Q1: how can someone people a foaf:Person but not a dbpedia-owl:Person? Secondly, there are people who have only the type yago:Person100007846: Q2: how can someone be a yago:Person100007846 but not have the other two types? Trying to dig a little deeper I looked at the wikipedia page for Claus Sluter. It appears that the page uses the Persondata template, which includes a date of death: I compared that to another artist, Louis Royer which also uses the Persondata template: That resource *is* defined as a foaf:Person. So I'm confused why both of these resources don't have a type of foaf:Person. The only difference I can see is that there's also a birthdate for that person. Q3: is the mapping to specific types sensitive to data in the templates? Any help would be greatly appreciated. Several times recently I've struggled to reliably query for people in dbpedia and it'd be great to get some of the inconsistencies identified/explained. Cheers, L. uOn 10/14/11 9:33 AM, Leigh Dodds wrote: DBpedia Extraction Team: Isn't anyone going to respond to these questions? If you don't have an answer, then at least say so. Simply ignoring the question is utterly unacceptable. I've deliberately opted to hold back my responses to the question posed. uOn 14 October 2011 14:33, Leigh Dodds < > wrote: IIRC, foaf:Person is extracted via a specific Pnd extractor, while dbpedia-owl:Person is extracted via template mappings. 
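As a concrete illustration of the two extraction paths just described, the sketch below (not an official DBpedia tool; the endpoint URL, prefixes and LIMIT are my own choices) asks the public endpoint for resources that are typed foaf:Person but carry no dbpedia-owl:Person type, reusing the SPARQLWrapper pattern that appears elsewhere on this list.

from SPARQLWrapper import SPARQLWrapper, JSON

# Minimal sketch: resources typed foaf:Person for which no dbpedia-owl:Person
# type was extracted, i.e. the gap Leigh's Q1 is about.
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dbo:  <http://dbpedia.org/ontology/>
    SELECT ?person WHERE {
        ?person a foaf:Person .
        FILTER NOT EXISTS { ?person a dbo:Person }
    } LIMIT 100
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["person"]["value"])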
The Persondata template in that article is a recent addition; I presume the yago type comes via category mappings (which would have been the only relevant data at the date of the dump used) - there is no mention in the code of yago, that I recall. It helps to also check the history: It can be: there can be conditional mappings, for example. uOn 16 October 2011 19:23, Kingsley Idehen < > wrote: I assume the kidehen listed on the sourceforge page is you, so doesn't 'Team' include you? I think you're being a little harsh, considering the question was asked on Friday and it's now Sunday. It's not unfair to expect the answer to such a question to not come before Monday, IMO. I hope you see the irony in trying to protest a lack of helpfulness by deliberately withholding help. uOn 10/16/11 3:08 PM, Jimmy O'Regan wrote: Re. DBpedia team, yes. Re. actual DBpedia Extraction part of the DBpedia project, not necessarily :-) In this case, I do have ideas about the solution, but I am increasingly weary of providing instant responses since they are most of the time misunderstood and misinterpreted, to get right to the point. We should acknowledge questions like this. I raised this because of the negative connotation (IMHO) that was already taking shape i.e., a question has been posed and for whatever reason it is being ignored. Zero irony. I would like other voices to participate in the process. Also, as stated earlier, DBpedia extraction from Wikipedia isn't an area of prime oversight by myself or anyone else at OpenLink. I hope you have clearer context for my comments now? uthis is right, unless there is a mapping defined with foaf:Person instead of dbpedia-owl:person And I think it is time to remove the PersondataExtractor from the frameworkIt is completely redundant, since these kind of infoboxes should be extracted by the Mapping extractor Yago definitions come as external sources and they may not match the data extracted from the DBpedia framework Cheers, Dimitris uOn 16 October 2011 20:18, Kingsley Idehen < > wrote: I think that broadening the community is what's needed to have a situation where support can be offered seven days a week, and I'm pretty sure (judging by the questions I was asked about our[1] experiences) that there are efforts underway to try to encourage more contributions (thus extending the community). Google Code-In was recently announced, which might be an opportunity for this. I can provide more details, if anyone is interested. [1] Apertium, a project where I actually am a member of the team :) uOn 10/16/11 4:41 PM, Jimmy O'Regan wrote: Jimmy, Yes, but understand DBpedia, the project, isn't a monolith. In retrospect, we (the main project leaders) could have spelled this out years ago. Here is the breakdown re. prime responsibilities, within the overall project. Freie University uOn 10/16/11 4:05 PM, Dimitris Kontokostas wrote: uOn 17 October 2011 00:42, Kingsley Idehen < > wrote: I had written something in reply to the rest of your mail, but I remembered GCI, and thought it more important to mention that. Perhaps instead of falling into the bad habit of personalising my original statement, I should have phrased it as exactly as possible. I'll try again: I think that to say that anyone is *ignoring* the question is a harsh characterisation, particularly as it came on a Friday afternoon. The more charitable view would be that people simply took the weekend off, which seems much more likely to me. 
Still, it's nice to see someone paying attention, in case a question goes unanswered. uHi Jimmy, Dimitris, Thanks for the follow-ups that helps me understand the process a little more. On 16 October 2011 21:05, Dimitris Kontokostas < > wrote: That's interesting, I'd overlooked the fact that the \"Yago Links\" also include type information. Cheers, L." "Movie review using dbpedia, questions" "uHello, I've already talked about my project \"Movie review using dbpedia\" [1] a while ago. I said that I'll release code ASAP. So now, that it's the summer hollidays, I have more time to code and maybe release some code. But I am not sure on some ponit - Howto implement a \"fulltext\" search on dbpedia, using sparql? I'd like to keep the app as simple as possible, so I am not in favor of things ala virtuoso etc. - Regarding the UIdoes a simple search box (searching in title property) as a starting point sounds good to you? - OR maybe a text box where to enter a wikipedia URI So, the last item on the list gave me an idea : why not develop a general review app for wikipedia? I can learn to and write a greasemonkey which add a link \"Review this thing\" on wikipedia. But maybe I should keep it small and focus only on filmsand find an elegant solution for the user to select the film he want to review. In the middle of these two solution : provide a simple input search box which search for (and find how to make it at least case insensitive) and a greasemonkey script Sorry for the \"brain-dump\" mail but I like to share my thought with people to have their view on the subject. So, please share your thought/suggestion/idea/whatever and let me know what you think about these solutions/idea. Thanks in advance, [1] Movies_review_using_dbpedia uSimon Rozet wrote: uHi Simon, Did you notice Tom Heath's Revyu project which works into a similar direction than you. Maybe you can somehow join forces? The DBpedia SPARQL endpoint gives you full-text search. Yes. I love search boxes. See also Georgie's Dbpedia Search interface. String search in title and maybe abstract is better. Yes, great idea. Again: See Tom's review anything site. Maybe a way to distingish you from Tom. But you should share generated review data anyway. Cheers Chris" "More on DBpedia and YAGO?" "uHi Chris, I was greatly intrigued by Ivan's update on WWW2007 where he states: \"The paper on Yago from Fabian Suchanek et al was pretty interesting Fabian said they would combine this somehow with dbpedia My understanding is that Fabian and Chris found some ways of binding Yago to RDF during the conference\" Are you able to comment further on this? Timeline? Approach? Etc. This combination has struck me as potentially huge for some time. :) Thanks, Mike uMike, DBpedia and Yago binding occurred prior to the Linked Data Session etc The only problem was that live demos where kinda difficult due to the combination of WiFi and (in my case) projector issues re. Macs etc The updated DBpedia data sets (which Yago enhancements) are now live :-) Georgi: Please elaborate for Mike Kingsley On 5/13/07 11:25 AM, \"Michael K. Bergman\" < > wrote:" "DBpedia Service down due to maintenance work" "uHi All, The DBpedia service will be down due to scheduled building maintenance work on Sunday 21st September 2008 from around 7AM EST to 5PM EST (possibly sooner). Please accept our apologies for any inconveniences this may cause. 
An announcement will be made when the service is back online. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: support" "A local DBPedia live" "uHi, As the subject says, I plan to set up a local DBpedia Live instance, but I know nothing about the process. @Dimitris: you told me in a previous mail that you could help me create it, is that still right? I just need some explanation of at least how to run it, and of these parameters in the "live.default.ini": - every parameter of the OAI configuration - every parameter of the statistics configuration - every parameter of the Input configuration - every parameter of the "OPTIONS FOR PUBLISHING UPDATES" section. One last question: is it possible to run an update just once per day, or every 3 days? Best. Julien. uHi Julien, Of course I will, but is it possible to put this ~10 days back? I am pretty busy right now and I need to put my extra time into GSoC. The GSoC application period ends on May 3rd, so after that it will be a lot easier for me. And yes, you can configure the update period as you wish. Cheers, Dimitris On Wed, Apr 24, 2013 at 6:42 PM, Julien Plu < > wrote: uOk, nice, no problem for me, I can wait 10 more days :-) I will do some tests first while waiting for your help. Thanks again. Best. Julien. 2013/4/25 Dimitris Kontokostas < >" "live updating update?" "uNo pressure, but what's the status of streaming updates from Wikipedia into DBpedia? Is there anything I can read about it in the meantime? Regards, John uJohn Muth wrote: Pressure is good! We are more or less there :-) See: We are close to going live. Kingsley uHello, the live extraction was delayed due to some reengineering, but is now in its final phase. We are shortly before an official announcement. The extraction is currently offline, because the property names for the OWL Axiom Annotations changed from e.g. owl:subject to owl:annotatedSource. We are waiting for the server until the old annotation graph ' If it is back online (I will post it here) you can test it at: Please do not make test edits on normal articles, as Wikipedia users get furious (vandalism). You can see the new way of engineering DBpedia here: Instance to Class Mapping: T and R-Box modelling: @John So there you can read some more about it. Regards, Sebastian Hellmann, AKSW PS: Kingsley Idehen wrote: uSebastian Hellmann wrote: Those look very interesting.
Does that mean the existing ontology owl document is deprecated? I see that birthplace has an additional owl:FunctionalProperty assertion for rdf:type. I'm definitely looking forward to the release :) uNo, the dbpedia ontology is not deprecated. We converted the information and mappings leading to the dbpedia ontology into template and will upload them to Wikipedia. So it serves as the basis, but will be editable by everyone on Wikipedia. We are waiting for the live extraction to be back online to load the templates, so that it can synchronize. Probably beginning next week. The uploading requires some manual steps, so everyone willing to help is welcome. Regards, Sebastian Hellmann, AKSW Michael Haas schrieb: uSebastian Hellmann wrote: I'm sorry, my question was unclear. Is the ontology owl document currently found in SVN outdated? I guess I could spare a few hours during the weekend. Regards, Michael" "Duplicate entries in DBpedia dumps" "uHi, I have been working with a few select files from the v3.8 dump of DBpedia, and have noticed duplicate entries in one of the files, *images_en.nt*. The entire file is 1.4 GiB, and contains 7 370 587 lines. I came across one statement, which is present in this file 10 times: < This one statement is present on lines: 2997045 3588625 5294480 5424560 5798660 5910525 6009955 6516790 6894525 7338075 Can someone tell me why this is? There may be other instances, but I've only come across this one, and I wanted to check with the community-at-large to see if this is known and/or intentional. Thanks, - A Hi, I have been working with a few select files from the v3.8 dump of DBpedia, and have noticed duplicate entries in one of the files, images_en.nt . The entire file is 1.4 GiB, and contains 7 370 587 lines. I came across one statement, which is present in this file 10 times: < A uHi Anthony, On 03/16/2013 04:54 AM, Anthony Lalande wrote: If you look at resources [1] and [2] for example, you will notice that both of them refer to the same image, i.e. \"CentralMichiganChippewas.png\". Upon running, the image extractor extracts the image along with its rights. So, the rights of that image are extracted twice. You can use the method described here [3], to remove the duplicates from the file. Hope that helps. [1] [2] [3] 20364-remove-duplicate-lines-file.html" "deal with large ontology file" "uHi, I have downloaded the dbpedia database, but I find it impossible to query against large files ,say geo_en.nt ,which is 150M in size. I can't read the whole file into my 1G memory and use rdflib to do query . Can anybody give some hint? Thanks in advance. Alex Hi, I have downloaded the dbpedia database, but I find it impossible to query against large files ,say geo_en.nt ,which is 150M in size. I can't read the whole file into my 1G memory and use rdflib to do query  . Can anybody give some hint? Thanks in advance. Alex uzhirun yu wrote: Hi, try using a triple store like Virtuoso or Jena's TDB + Joseki for SPARQL queries. uHi Alex, Which DBpedia database file have you downloaded and what database are you hosting it in ? 
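For both file-level questions above, the repeated statements in images_en.nt and the 150 MB geo_en.nt that will not fit into 1 GB of RAM with rdflib, a streaming pass is usually enough before the data goes into a triple store. The sketch below is only an illustration (the output file name is made up): it reads an N-Triples dump line by line, keeps one 16-byte digest per distinct line, and writes each statement once, so the whole file never has to be held in memory.

import hashlib

# Rough sketch, not an official DBpedia tool: stream an N-Triples file and
# drop exact duplicate lines. Memory use is one MD5 digest (plus set
# overhead) per distinct statement, far less than loading the file whole.
seen = set()
with open("images_en.nt", "rb") as src, open("images_en.dedup.nt", "wb") as dst:
    for line in src:
        digest = hashlib.md5(line).digest()
        if digest not in seen:
            seen.add(digest)
            dst.write(line)

On a Unix box, piping the uncompressed file through sort -u removes the duplicates just as well, at the cost of reordering the lines; and for actually querying files of this size, a triple store such as Virtuoso or Jena TDB, as suggested above, remains a better home than an in-memory rdflib graph.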
The current live DBpedia sparql endpoint ( ) is hosted in Virtuoso: If you don't have Virtuoso already an open source version is available from: With an installer script for loading DBpedia datasets available from: General performance tuning guideline are available at: Alternatively, the the DBpedia 3.3 dataset is available as a Public AWS dataset in the cloud and can be loaded into a Virtuoso Amazon EC2 AMI in minutes at detailed at: Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: http://boards.openlinksw.com/support Twitter: http://twitter.com/OpenLink On 23 Oct 2009, at 12:38, zhirun yu wrote:" "Bls: extraction problem" "uHi Gaurav, Try to check again your extraction.de.property \"# download and extraction target dir dir=/mnt/ebs/perl/framework/extraction-framework/dump/wiki_dump # Source file. If source file name ends with .gz or .bz2, it is unzipped on the fly.  # Must exist in the directory xxwiki/20121231 and have the prefix xxwiki-20121231-.   # default: # source=pages-articles.xml # alternatives: source=pages-articles.xml.bz2 # source=pages-articles.xml.gz # use only directories that contain a 'download-complete' file? Default is false. require-download-complete=true # unqualified extractor class names are prefixed by org.dbpedia.extraction.mappings. # All 111 languages that as of 2012-05-25 have 10000 articles or more. # TODO: parse wikipedias.csv and figure out from there which languages to extract. # If no languages are given, the ones having a mapping namespace on mappings.dbpedia.org are used  languages=de extractors=InfoboxExtractor #ArticleCategoriesExtractor,CategoryLabelExtractor,ExternalLinksExtractor,\ #GeoExtractor,InfoboxExtractor,LabelExtractor,PageIdExtractor,PageLinksExtractor,\ #RedirectExtractor,RevisionIdExtractor,SkosCategoriesExtractor,WikiPageExtractor extractors.de=InfoboxExtractor #extractors.de=MappingExtractor,DisambiguationExtractor,InterLanguageLinksExtractor,RedirectExtractor,LabelExtractor #extractors.en=MappingExtractor,DisambiguationExtractor,InterLanguageLinksExtractor,RedirectExtractor,LabelExtractor # if ontology and mapping files are not given or do not exist, download info from mappings.dbpedia.org ontology=/ontology.xml mappings=/mappings # URI policies. Allowed flags: uri, generic, xml-safe. Each flag may have on of the suffixes # -subjects, -predicates, -objects, -datatype, -context to match only URIs in a certain position.  # Without a suffix, a flag matches all URI positions. uri-policy.uri=uri:en; generic:en; xml-safe-predicates:* uri-policy.iri=generic:en; xml-safe-predicates:* # File formats. Allowed flags: n-triples, n-quads, turtle-triples, turtle-quads, trix-triples, trix-quads # May be followed by a semicolon and a URI policy name. If format name ends with .gz or .bz2, files # are zipped on the fly. # NT is unreadable anyway - might as well use URIs format.nt=n-triples;uri-policy.uri #format.nq.gz=n-quads;uri-policy.uri # Turtle is much more readable - use nice IRIs format.ttl=turtle-triples;uri-policy.iri #format.tql.gz=turtle-quads;uri-policy.iri \" You write dir, so there is not base-dir in your extraction configuration.   
Cheers, Riko Riko Adi Prasetya Faculty of Computer Science Universitas Indonesia From: gaurav pant < > To: Sent: Tuesday, 5 March 2013 12:10 Subject: [Dbpedia-discussion] extraction problem Hi All, Greetings for the day. I want to extract infobox properties and abstracts from the German dump (pages-articles.xml.bz2). I am able to download this file using the command "/run download config=download.de.properties", where I have configured download.de.properties to download only the German page-article file. Now, when I try to extract information from it using "/run extraction extraction.de.property", it gives me the error below. In extraction.de.property I have set dir properly, the same as in download.de.properties. Please let me know what is going wrong. Does any change need to be made in the pom.xml of the dump dir? " [INFO] uHi Gaurav, Be patient, I spent 4 hours extracting the Indonesian data dump. I think it depends on the host spec and the size of the data dump. Yes, the extracted triples go in the same source directory. Cheers, Riko From: gaurav pant < > To: riko adi prasetya < > Sent: Tuesday, 5 March 2013 12:38 Subject: Re: [Dbpedia-discussion] extraction problem Hi Riko, Thanks for your reply. I have tried with that change. It is running, but it has been waiting for a long time at " Mar 05, 2013 5:13:30 AM org.dbpedia.extraction.mappings.Redirects$ loadFromCache INFO: Loading redirects from cache file /mnt/ebs/perl/framework/extraction-framework/dump/wiki_dump/dewiki/20130219/dewiki-20130219-template-redirects.obj Mar 05, 2013 5:13:30 AM org.dbpedia.extraction.mappings.Redirects$ load INFO: Will extract redirects from source for de wiki, could not load cache file '/mnt/ebs/perl/framework/extraction-framework/dump/wiki_dump/dewiki/20130219/dewiki-20130219-template-redirects.obj': java.io.FileNotFoundException: /mnt/ebs/perl/framework/extraction-framework/dump/wiki_dump/dewiki/20130219/dewiki-20130219-template-redirects.obj (No such file or directory) Mar 05, 2013 5:13:30 AM org.dbpedia.extraction.mappings.Redirects$ loadFromSource INFO: Loading redirects from source (de) Mar 05, 2013 5:28:58 AM org.dbpedia.extraction.mappings.Redirects$RedirectFinder apply WARNING: wrong redirect. page: [title=Mikrogramm;ns=0/Main/;language:wiki=de,locale=de]. found by dbpedia: [title=Gramm;ns=0/Main/;language:wiki=de,locale=de]. found by wikipedia: [null] " Is it because I downloaded the page-article file manually instead of with the DBpedia downloader, so that the other required files could not be downloaded? Also, where will it put the extracted triples - in the same source directory? On Tue, Mar 5, 2013 at 10:56 AM, riko adi prasetya < > wrote: Hi Gaurav," "linking conceptnet to Dbpedia, Yago" "uConceptNet is a commonsense database in the form of a large semantic network containing 150,000 concepts and 700,000 assertions. The concepts are linked to each other by 23 predicates (e.g. Dog isa pet). Currently I am trying to bring this dataset (Creative Commons 3.0 license) onto the Semantic Web by linking it with DBpedia. However, as I am new to this field, I request your advice on how it can be done. So far I have been successful in linking the good quality (more than one assertion) concepts from ConceptNet with cwcc classes and yago classes.
( the concepts are linked with categories as all member of the categories can be considered as related to the concept) The data is available from here aubhat2.googlepages.com Before linking the concepts with wikipedia articles (as done in cyc dataset) i would like to get your opinions regarding the similarity metrics which can be used. uHi Akshay, The description of Conceptnet on the project page sounds like this ontology could be very useful within many projects and it would be great to have it on the Semantic Web! Frederick and Mike (cc'ed) are currently doing similar work with interlinking their UMBEL ontology Yago. I guess they would be the right people to talk to about similarity metrics. For information about publishing your data, you could have a look at Keep on the good work! Cheers Chris uHi Akshay, Chris Bizer wrote: We, too, looked closely at ConceptNet and were quite intrigued. To my knowledge, the system is not being as actively maintained as previously, and that was a concern to us. But, that was only one factor among many that caused us to select OpenCyc as our reference concept basis. [1] We welcome you to follow the UMBEL project via its Google group [2] and to look for the coming documentation of how we explicitly vetted and mapped from OpenCyc. That documentation in three-volumes, due out hopefully by the end of this month, may provide some ideas for how you might tackle a ConceptNet mapping. I would guess there is a full person-year of efforts in our UMBEL stuff to date, so don't be surprised if your own effort is quite a bit to chew! As for similarity metrics, Chris has commented elsewhere on this and we have some new UMBEL predicates that deal with similar but non-equivalent class-class alignments (isAligned), entity-class (isAbout) and entity-entity (isLike) with associated predicates for inverse properties and similarity metrics. (The UMBEL ontology documentation and reference files will also be released at the same time as the OpenCyc vetting volumes.) Applying any of these predicates may be pretty tricky. The use of similarity metrics (\"confidence intervals\" if you will), especially, is a new area, and as I have written elsewhere, possibly as likely as nuclear waste to be widely embraced. :). In the interim, you may want to look at the Dec 2007 Yago technical report [3] and its Web site [4] for one approach. You may also want to look at some of the recent SKOS discussion on these issues of similarity alignments.[5] Good luck! Thanks, Mike [1] [2] [3] [4] [5] Mapping section at end" "three problems with sparql-queries" "uHi All, I have still three problems: #1: always i test that query[1] on the DBpedia SPARQL-Endpoint ( pool size 10518528 reached the limit 10485760 bytes, try to increase the MaxMemPoolSize ini setting.\" If i simplify the query and leave out the part with the director or the producer, it works. #2: some of my queries[2] are still to slow, it takes to much time for the results, or i run in the maximum execution time. #3: whenever i use bif:contains in a query[3] and there is a blank (e.g. ?cityname bif:contains \"New York\") in the searchstring, i get this error: \"37000 Error XM029: Free-text expression, line 1: syntax error at York\" best regards, Paul Kreis Queries: [1] PREFIX foaf: PREFIX dbpedia2: SELECT DISTINCT ?filmname ?page FROM WHERE { ?film dbpedia2:name ?filmname . ?film foaf:page ?page . {?film ?property } OPTIONAL {?film ?property } {?film dbpedia2:director ?director . ?director foaf:name ?directorname . 
?directorname bif:contains \"Kennedy\"} {?film dbpedia2:producer ?producer . ?producer foaf:name ?producername . ?producername bif:contains \"Spielberg\"} {?film dbpedia2:year ?year }UNION{?film dbpedia2:years ?year } FILTER (?year > \"2000\"^^xsd:integer) } ORDER BY ?filmname [2] PREFIX foaf: PREFIX rdfs: PREFIX geo: SELECT DISTINCT ?cityname ?lat ?long ?page FROM WHERE { { ?city ?property } UNION { ?city ?property } UNION { ?city ?property } ?city rdfs:label ?cityname . FILTER (LANG(?cityname) = \"en\") . ?city geo:long ?long . ?city geo:lat ?lat . ?city foaf:page ?page . ?cityname bif:contains \"Berlin\" . } ORDER BY ?cityname [3] PREFIX foaf: PREFIX rdfs: PREFIX geo: SELECT DISTINCT ?cityname ?page FROM WHERE { { ?city ?property } UNION { ?city ?property } UNION { ?city ?property } ?city rdfs:label ?cityname . FILTER (LANG(?cityname) = \"en\") . ?city foaf:page ?page . ?cityname bif:contains \"New York\" . } ORDER BY ?cityname uPaul, On 11 Apr 2008, at 19:46, Paul Kreis wrote: [Hugh] The MaxMemPoolSize for the DBpedia SPARQL Endpoint has been doubled, and I can now run query #1 although it does return an empty result set. Please re-run yourself [Hugh] I have been able to run query #2, so this may have benefited from the MaxMemPoolSize increase. Please re-run yourself > [Hugh] You should use ?cityname bif:contains \"'New York'\" , note the single quotes inside the double quotes , which worked for me. Please re-run yourself Best Regards Hugh Williams Professional Services OpenLink Software uSorry for the late response [Paul] Thanks! [Paul] I think the speed has nothing to do with the MaxMemPoolSize. It is still very slow. Maybe you can run the query with other citys (e.g. Hamburg, New York, it's slow too) because the server cashs the results and it takes only a few seconds if someone has run the query with the same city before. [Paul] That one was simple :-) Thanks a lot Paul Kreis uPaul, On 25 Apr 2008, at 18:03, Paul Kreis wrote: [Hugh] Indeed it is quite slow with the Berlin query taking abt 1min to complete, after which is returned in a few secs having been cached as you said. I ran a query with \"London\" as the city and took abt 3mins to complete, it even took a few mins when I re-ran it. Perhaps one of the DBpedia/Sparql gurus on the mailing list can comment on how the query can be optimized ??? Regards Hugh" ""Test this mapping" broken on mappings wiki" "uWhen I click on a template such as: \"Test this mapping\", I get an error of \"Service Temporarily Unavailable\". Can someone check into this? Thanks, Chris uHi, sorry, there was a problem with the server. It you work again now. Regards, Max On Fri, Oct 22, 2010 at 9:20 AM, Chris Davis < > wrote:" "United Nations FAO geo-political ontology & dbpedia (was: how to register a RDF into DBPedia)" "uHi Soonho, What do you mean by register? I see 3 options: 1. Link the countries/concepts in this ontology to their dbpedia equivalents. 2. Add this data to the dbpedia data, and have it published on mainly from wikipedia, but including a few other sources. 3. Add this data to your own copy of dbpedia to query them together, this depends on your database :) I actually had a look at the FAO ontology a while back, and you already \"link\" to dbpedia: I.e. countries have properties like: Turks_and_Caicos_Islands This means that for option 1, you have already done the hard conceptual work (i.e. figuring out what countries match). 
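To tie Hugh's bif:contains fix from the three-problems thread above into a runnable form, here is a minimal sketch (my own example; the label pattern and LIMIT are arbitrary). The multi-word phrase is wrapped in single quotes inside the double-quoted string, which is what avoids the "syntax error at York" message; note that bif:contains is a Virtuoso-specific extension, so this only works against Virtuoso-backed endpoints such as the public DBpedia one.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
# The single quotes inside the double quotes mark "New York" as one phrase.
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?city ?name WHERE {
        ?city rdfs:label ?name .
        ?name bif:contains "'New York'" .
        FILTER ( lang(?name) = "en" )
    } LIMIT 20
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["city"]["value"], "-", row["name"]["value"])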
However, you have published it in your own crazy way that noone else understands :) The link should be a resource (i.e. URI) not a literal (i.e. string). It could for instance be: For option 2 - it may well be worth exploring adding the FAO country list somehow - it is quite an authorative source as to what constitutes a country (re: paul houle most justified complaint that dbpedia lists 3000 countries) Cheers, - Gunnar Do you mean link these On 27/07/10 16:53, Kim, Soonho (OEKM) wrote: uDear Gunnar; Thanks for your answer. 2. Add this data to the dbpedia data, and have it published on mainly from wikipedia, but including a few other sources. I meant the option 2. I was wondering that theremight be a way to publish the geopoliltical ontology into DBPeida. : ) If you know how the other sources were added into the DBPedia, could you introduce it to me? .e. countries have properties like: Turks_and_Caicos_Islands This means that for option 1, you have already done the hard conceptual work (i.e. figuring out what countries match). However, you have published it in your own crazy way that noone else understands :) The link should be a resource (i.e. URI) not a literal (i.e. string). It could for instance be: Thanks for your comments on it. There is a reason why we are using just DBPediaID instead of owl:sameAs. I keep maintaining the geopolitical ontology as OWL DL. If I am using owl:sameAs in OWL DL syntax, then I need to import the assigned rdf:resource (i.e. Best Regards, Soonho Kim Best Regards, Soonho Kim From: Gunnar Aastrand Grimnes [ ] Sent: Wednesday, August 11, 2010 2:36 PM To: Subject: Re: [Dbpedia-discussion] United Nations FAO geo-political ontology & dbpedia (was: how to register a RDF into DBPedia) Hi Soonho, What do you mean by register? I see 3 options: 1. Link the countries/concepts in this ontology to their dbpedia equivalents. 2. Add this data to the dbpedia data, and have it published on mainly from wikipedia, but including a few other sources. 3. Add this data to your own copy of dbpedia to query them together, this depends on your database :) I actually had a look at the FAO ontology a while back, and you already \"link\" to dbpedia: I.e. countries have properties like: Turks_and_Caicos_Islands This means that for option 1, you have already done the hard conceptual work (i.e. figuring out what countries match). However, you have published it in your own crazy way that noone else understands :) The link should be a resource (i.e. URI) not a literal (i.e. string). It could for instance be: For option 2 - it may well be worth exploring adding the FAO country list somehow - it is quite an authorative source as to what constitutes a country (re: paul houle most justified complaint that dbpedia lists 3000 countries) Cheers, - Gunnar Do you mean link these On 27/07/10 16:53, Kim, Soonho (OEKM) wrote:" "Use of Fallback Languages for Query Results" "uHi! I'm trying to build localization into a DBpedia application, but I've got a problem. I'm trying to use multilingual results from DBpedia in the follow way: - If available, show results in (for example) German. - If not available, show results in the next available language (which usually is English). SPARQL example: SELECT ?text WHERE { dbpprop:abstract ?text. OPTIONAL { FILTER langMatches( lang(?text), \"de\" ) } } The problem is that the query never chooses an available language when the preferred language is not available. 
For example, no results are returned if the following code is used: FILTER langMatches( lang(?text), \"he\" ) The same happens for other languages as well. Does anyone know of a solution to this? Thank you, Stephen Hatton Hi! I'm trying to build localization into a DBpedia application, but I've got a problem. I'm trying to use multilingual results from DBpedia in the follow way: If available, show results in (for example) German. If not available, show results in the next available language (which usually is English). SPARQL example: SELECT ?text WHERE { < Hatton uStephen, your query simply asks to fill ?text if language \"de\"/\"he\" is available (with the OPTIONAL keyword the query does not fail), if not, ?text will be empty. You may look at \"Matching Alternatives\"[1], but I don't think there is a way to encode \"the next available language \" into a query. But: You may use UNION to define an order of preferred languages within the WHERE part. I would probably ask OPTIONAL for both, \"de\" and \"en\" & would solve the problem at application level. Best wishes, Martin [1] Am 29.12.2009 um 21:21 schrieb Stephen Hatton:" "Content negotiation's response is terse" "uGreetings. Why does direct content negotiation only return a subset of what is provided at e.g. Thanks, Tim This is all that is returned: bash-3.2$ curl -H \"Accept: text/turtle\" -L @prefix dbr: . dbr:Neman_River owl:sameAs dbr:Neman_River . @prefix rdfs: . dbr:Neman_River rdfs:label \"Neman River\"@en . @prefix foaf: . @prefix wikipedia-en: . dbr:Neman_River foaf:isPrimaryTopicOf wikipedia-en:Neman_River . @prefix prov: . dbr:Neman_River prov:wasDerivedFrom . @prefix dbo: . dbr:Neman_River dbo:wikiPageRedirects ; dbo:wikiPageID 44543203 ; dbo:wikiPageRevisionID 635904133 . wikipedia-en:Neman_River foaf:primaryTopic dbr:Neman_River . Timothy Lebo foaf:Person. uHey Tim, You are looking for different IRIs: The reason is stated in the resource description below: A second request will give you what you are looking for: $ curl -H \"Accept: text/turtle\" -L ' These redirects are resolved within the DBpedia dataset, i.e. redirect resources do not have incoming links. But it might happen, that such IRIs do exist in external datasets, e.g. from outdated links, or user interaction. If you want to process the data automatically, you would need to follow these redirects when accessing DBpedia IRIs in order to get the data you are looking for. Best regards Magnus" "foaf:img and file location" "uHi, I've loaded a number of dbpedia files into my local mirror : articlecategories_en.nt homepage_en.nt redirect_en.nt articles_label_en.nt infobox_en.nt redirect_en.nt categories_label_en.nt infoboxproperties_en.nt shortabstract_en.nt disambiguation_en.nt longabstract_en.nt wikipage_en.nt externallinks_en.nt pagelinks_en.nt yagolink_en.nt Looking at the following resource on the local mirror I find it has no foaf:img property data : I'm assuming I've missed a file which contains this data - could anyone tell me which file is required for this data ? I'm dreading downloading and grepping through all the other files to find this property ! Thanks, Rob uHi, you may try the Image Dataset: There is also a preview Button on the Download Site to view some Triples from each Dataset. Jörg uJörg Schüppel wrote: Thats it - Thanks, I couldn't see the wood for the trees !" "Inconsistent results from sparql queries" "uHi, I'm having difficulty querying dbpedia using the sparql interface on I'm requesting the label, abstract and a list of redirects, and approx. 
20% of the time I'm just getting the redirects. With an internal instance, running an older snapshot of dbpedia, I'm seeing the same problem for around 1% of queries. In every example I've checked, the dbpedia id is still current. So, here are some examples. Failing on dbpedia-live only Adolf_Hitler Chicago Dora_Bryan Emma_Thompson Ferrari Failing on snapshot only Austriamicrosystems Valery_Gergiev Lady_Rachel_Billington Failing on both G20 Darcy_Edwards This is the query I'm using, where '$lod_id' is one of the strings listed above. SELECT ?label, ?abstract, ?redirect, COUNT(?wikilink) WHERE { { ?abstract . FILTER ( langMatches( lang(?abstract), 'en') || ! langMatches (lang(?abstract),'*') ) . ?label . FILTER ( langMatches( lang(?label), 'en') || ! langMatches(lang(? label),'*') ) . } UNION { OPTIONAL { ?redirect . OPTIONAL { ?wikilink ? redirect } } } } Can anyone suggest why this might fail for valid entries? TIA Dave Spacey This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uHave you tried using the following query? If you are doing a UNION you don't need to have two OPTIONAL clauses inside it. That may improve the performance, as the issue may be that the query is being cut off after a certain amount of computation time and you are getting partial results based on the physical file layout of the RDF database and number of RDF statements surrounding each URI. Also, why are you doing the negation test on the language? I thought that all abstracts and labels in DBpedia had language tags attached to them, so it should be okay just to do the english test. SELECT ?label, ?abstract, ?redirect, COUNT(?wikilink) WHERE { { ?abstract . FILTER ( langMatches( lang(?abstract), 'en') || ! langMatches (lang(?abstract),'*') ) . ?label . FILTER ( langMatches( lang(?label), 'en') || ! langMatches(lang(? label),'*') ) . } UNION { ?redirect . OPTIONAL { ?wikilink ? redirect } } } Cheers, Peter 2010/1/15 David Spacey < >:" "Ignored templates" "uHi guys, why is the Infobox_venue marked as ignored in the en mapping? This infobox is used for arts, cultural and music venues, e.g. Can we remove it from the ignore list? Cheers Andrea Hi guys, why is the Infobox_venue marked as ignored in the en mapping? This infobox is used for arts, cultural and music venues, e.g. Andrea" "DBpedia Extraction Framework - Problem downloading latest dumps" "uHi dbpedia-community! I'm experiencing heavy problems trying to get the extraction framework to run. The step I'm stuck at is downloading the dumps. My config-file seems to be correct as the download is started by the framework when running \"mvn scala:run\". Nevertheless the download times-out at a random state of data downloaded. Downloading this file with my browser is 10x slower than by downloading it with the framework. Downloading it with the browser results in the supposedly completely downloaded archive which is corrupted everytime since the download times out or else (The browser shows the download as completed though). At the moment it's impossible for me to get the dumps. I hope someone can please help me out since I need the most recent data at hand! 
Regards, David My config-file: dumpDir=K:/Work/Eclipse Workspace/DBpedia_Dumps/to_update outputDir=K:/Work/Eclipse Workspace/DBpedia_Dumps/updated updateDumps=true extractors=org.dbpedia.extraction.mappings.LabelExtractor \ org.dbpedia.extraction.mappings.WikiPageExtractor \ org.dbpedia.extraction.mappings.InfoboxExtractor \ org.dbpedia.extraction.mappings.PageLinksExtractor \ org.dbpedia.extraction.mappings.GeoExtractor extractors.en=org.dbpedia.extraction.mappings.CategoryLabelExtractor \ org.dbpedia.extraction.mappings.ArticleCategoriesExtractor \ org.dbpedia.extraction.mappings.ExternalLinksExtractor \ org.dbpedia.extraction.mappings.HomepageExtractor \ org.dbpedia.extraction.mappings.DisambiguationExtractor \ org.dbpedia.extraction.mappings.PersondataExtractor \ org.dbpedia.extraction.mappings.PndExtractor \ org.dbpedia.extraction.mappings.SkosCategoriesExtractor \ org.dbpedia.extraction.mappings.RedirectExtractor \ org.dbpedia.extraction.mappings.MappingExtractor \ org.dbpedia.extraction.mappings.PageIdExtractor \ org.dbpedia.extraction.mappings.AbstractExtractor \ org.dbpedia.extraction.mappings.RevisionIdExtractor languages=en uHi David, What about downloading with wget? Cheers, Pablo On Thu, Mar 29, 2012 at 5:33 PM, David Gösenbauer < > wrote: uHi David, Pablo is right - if you only download a few files, wget is great. :-) The old downloader was broken. I recently rewrote it, but didn't integrate it with the extraction code yet (I'm not even sure that's a good idea), so it's a separate step. Try using mvn scala:run download in the directory extraction_framework/dump. The configuration is in download.properties or directly in the pom.xml. These settings should work for you. (I hope the line breaks survive intact) # NOTE: format is not java.util.Properties, but # org.dbpedia.extraction.dump.download.Config dir=K:/Work/Eclipse Workspace/DBpedia_Dumps/to_update base= dump=commons,en:pages-articles.xml.bz2 unzip=true retry-max=5 retry-millis=10000 #the following is only needed when you download #wikipedia language editions by their article count #csv= #the following are only needed if want to run the #AbstractExtractor, which uses a local MediaWiki #installation and takes several days to run. #dump=en:image.sql.gz,imagelinks.sql.gz,langlinks.sql.gz,templatelinks.sql.gz,categorylinks.sql.gz #other= Cheers, Christopher On Thu, Mar 29, 2012 at 17:42, Pablo Mendes < > wrote: uSorry, that should have been mvn scala:run -Dlauncher=download On Fri, Mar 30, 2012 at 00:24, Jona Christopher Sahnwaldt < > wrote:" "Semantic Wikipedia work at Shanghai Jiao Tong University?" "uHi, scanning through the ISWC research track ( noticed two papers from Shanghai Jiao Tong University which seam to work in a similar direction as DBpedia: PORE: Positive-Only Relation Extraction from Wikipedia Text Gang Wang, Shanghai Jiao Tong University Yong Yu, ApexLab Haiping Zhu, Shanghai Jiao Tong University Making More Wikipedians: Facilitating Semantics Reuse for Wikipedia Authoring Linyun Fu, Shanghai Jiao Tong University Haofen Wang, Shanghai Jiao Tong University Haiping Zhu, Shanghai Jiao Tong University Huajie Zhang, Shanghai Jiao Tong University Yang Wang, Shanghai JiaoTong University Yong Yu, ApexLab I didn't find copies of the papers online yet :-( Does anybody know these guys and their work? Cheers Chris" "dbpedia sparql : how to change upper execution time limit : Virtuoso 42000 error" "uHi folks : I am querying dbpedia using sparql through python. I am facing issues with the upper execution time limit . 
This is the error : Code : sparql = SPARQLWrapper(\" newquery = \"DEFINE input:inference \\"skos-trans\\" \ PREFIX dcterms: \ select distinct ?cat1 ?cat2 ?cat3 ?cat4 where { \ dcterms:subject ?cat1 . \ ?cat1 skos:broaderTransitive ?cat2. \ ?cat2 skos:broaderTransitive ?cat3. \ ?cat3 skos:broaderTransitive ?cat4. \ } \" sparql.setQuery( newquery) sparql.setReturnFormat(JSON) results = sparql.query().convert() Response : Virtuoso 42000 Error The estimated execution time 5286 (sec) exceeds the limit of 3000 (sec) How to get through this problem ?? Regards Somesh Jain 4th year Undergraduate Student Department of Computer Science & Engineering IIT Kharagpur Hi folks : I am querying dbpedia using sparql through python. I am facing issues with the upper execution time limit . This is the error : Code : sparql = SPARQLWrapper(' Kharagpur" "The first template that occurs in a wiki article defines the class of the entity" "uHi guys, A recap of the issue we discussed last Friday during the telco follows. Enjoy Leipzig in the meanwhile! Cheers, Marco In the following article: 2 templates are used: 'personaggio' and 'fumetto e animazione'. As 'personaggio' comes first, the generated entity is typed as 'FictionalCharacter' i.e. 'personaggio' and lacks of the 'fumetto' type. Instead, the 2 templates should generate 2 different entities, 1 typed as 'personaggio' and the other typed as 'fumetto'. 1. We should investigate how big the problem is i.e. run analytics at least on the Italian wikipedia dump. 1a. If it's only a matter of some articles, then just modify the article source 1b. If the problem is big, then declare class disjunction i.e. owl:disjointWith axiom in the DBpedia ontology and raise an error when 2 templates map to disjoint classes Other examples: index.php?title=Alfredo_Binda&action;=edit uHi Marco, Thanks for the remainder. Currently we are short on developer time and, since you are mainly affected by this issue, can you create a script to quantify it? Ideally the script could be applied to other languages as well and get the general picture of this problem. Cheers, Dimitris On Mon, Sep 24, 2012 at 4:11 PM, Marco Fossati < > wrote:" ""Resource" missing from Ontology OWL" "uPerhaps I'm missing something, but when I attempt to load the instancetypes_en file into my system, I'm finding that everything that is typed seems to have a rdfs:type link to My understanding is that this is the base type for the dbpedia ontology, so that everything that has a known type derives from it (it's a bit like \"Topic\" in Freebase) This isn't what the OWL file shows, however, I don't see any reference to \"Resource\" in the OWL file, and the OWL file shows that toplevel types derive from owl#Thing, Place The visual view of the class hierarchy at also shows everything deriving from owl:Thing. It's not a problem for me either way, but my tools definitely complain, since they expect the types to match up with the type hierarchy. Personally I like the idea of a \"Resource\" type, since there are certain attributes that apply to all dbpedia entries, but I think they'd apply to things typed and untyped." "Semantic Publishing/Nanopublications and DBpedia URIs" "uDear all, first of all I like to thank everybody who contributed to this enormously valuable project! I have a very special use case: I like to use DBpedia URIs/IRIs to formulate RDF triples as a representation of the content of a scholarly publication in the humanities. 
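One hedged way around the estimated-execution-time error above is to make the query cheaper rather than to raise the server limit: fetch the category hierarchy one level at a time and walk it client-side instead of asking for a four-hop transitive join in a single query. The sketch below reuses the SPARQLWrapper setup from the original message; the starting resource, the depth of three extra levels, and the use of plain skos:broader per level (instead of the server-side skos-trans inference) are all assumptions of mine.

from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://dbpedia.org/sparql"
PREFIXES = """
    PREFIX dct:  <http://purl.org/dc/terms/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
"""

def select(query):
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(PREFIXES + query)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return {row["c"]["value"] for row in rows}

# Level 0: the direct categories of one (assumed) starting resource.
frontier = select("""
    SELECT DISTINCT ?c WHERE {
        <http://dbpedia.org/resource/Leipzig> dct:subject ?c .
    }""")
seen = set(frontier)

# Walk three more levels with one small query each; for very large
# frontiers the VALUES list should be sent in batches.
for level in range(3):
    if not frontier:
        break
    values = " ".join("<%s>" % c for c in frontier)
    frontier = select("""
        SELECT DISTINCT ?c WHERE {
            VALUES ?cat { %s }
            ?cat skos:broader ?c .
        }""" % values) - seen
    seen |= frontier

print(len(seen), "categories reached within four levels")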
Therefore I am especially interested in the quality of abstracts to lemmata of abstract concepts and I am discussing this with the de.wikipedia community at the moment. But I also have at least one question for you: DBpedia URIs are only useful in this context, if they are persistently linked to a timestamp and the content of the abstract given at a particular time. Only then I can recover what the author might have meant when he or she published the RDF triples. Of course, it is also interesting to see, how changing definitions, etymology, semantics influence the statement later, but still you have to reconstruct the historical dispositions. I hope I could explain myself sufficiently and I am very thankful to any comments: Did I miss that DBpedia is already doing something like this and if not: have you ever thought about this kind of persistence? Regards, Nora" "WikiXMLDB" "uHi folks! We have released WikiXMLDB, where we have parsed Wikipedia into a true XML format and put it online into Sedna XML Database. You can write XQueries against the whole Wikipedia now: There are some clear advantages of preserving (actually converting the media-wiki there first) the XML markup - you can pick out individual sections, extract data from there, etc. Of course XQuery is not that great at lots of outerjoins and any OWL stuff, so maybe there is a winning hybrid approach towards the whole deal. Also, if you find cool use-cases or ideas that rely on the XML representation of Wikipedia, please let us know! Best regards, Pavel Velikhov Sedna team Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. ;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ" "Feedback for new website and communication group" "uYes, but the content is still sparse and some pages are badly broken. I'm trying to fix the query links on this page: E.g. the first link in 4.2.1 makes this query, which obviously won't do: SELECT * WHERE { ?subject . } Could someone from Virtuoso please add these prefixes so we won't have to overload sample queries with prefixes? And remove dbpedia, dbpedia2 Thanks! (And we may need to add a few more in the near future) IMHO the most important missing page is It provided an invaluable road map to those Drupal problem: if when editing a page I click directly on a higher level in the breadcrumb, it goes to a URL like this and nothing happens: t=services-resources/datasets/dbpedia-data-set-2014%3FdialogFeatures%3Dproto col%3Dhttp goes to which is a list of datasets, including a link back to itself No actual statistics are provided by this link. have a history, e.g. ent http://dbpediawww.informatik.uni-leipzig.de/services-resources/ontology/2014 We need version history, but I think it's best to link to historic versions at http://lov.okfn.org/dataset/lov/vocabs/dbpedia-owl Currently they got dbpedia-owl_2014-07-15.n3 We can write at the LOV forum https://plus.google.com/communities/108509791366293651606 and ask them to archive at a particular date (they are very responsive). http://dbpedia.org/ontology/ serves ok but it says 4.0-SNAPSHOT of 2015-04-10. We need better-defined version numbers to come at well-defined dates. uOn 7/28/15 5:27 AM, Vladimir Alexiev wrote: Patrick/Mitko, Please apply the request above to the DBpedia instance. The presets should also go into the DBpedia VAD. Concern: This change needs to result in additional prefix to namespace mappings rather than replacements. I say that because of existing examples using the dbpedia: prefix. uCan you add the same to live.dbpedia.org? 
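Until prefix presets like these are configured on the endpoint, the sample queries on the website can simply carry their own PREFIX declarations, which also keeps them copy-and-paste-able against any other SPARQL endpoint. A small, hedged illustration (the query is just an example, not one of the broken links mentioned above):

from SPARQLWrapper import SPARQLWrapper, JSON

# Every namespace the query touches is declared up front, so nothing
# depends on whatever presets the endpoint happens to define.
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbr:  <http://dbpedia.org/resource/>
    PREFIX dct:  <http://purl.org/dc/terms/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label ?cat WHERE {
        dbr:DBpedia rdfs:label ?label ;
                    dct:subject ?cat .
        FILTER ( lang(?label) = "en" )
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["label"]["value"], row["cat"]["value"])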
Thanks! uOn 8/16/15 9:40 PM, Vladimir Alexiev wrote: Yes. Kingsley" "2 Doctorate and 1 PostDoc position at AKSW / Uni Leipzig" "uFor collaborative research projects in the area of Linked Data technologies and Semantic Web the research group Agile Knowledge Engineering and Semantic Web (AKSW) at Universität Leipzig opens positions for: *1 Postdoctoral Researcher (TV-L E13/14)* The ideal candidate holds a doctoral degree in Computer Science or a related field and is able to combine theoretical and practical aspects in her/his work. The candidate is expected to build up a small team by successfully competing for funding, supervising doctoral students, and collaborating with industry. Fluent English communication and software technology skills are fundamental requirements. The candidate should have a background in at least one of the following fields: * semantic web technologies and linked data * knowledge representations and ontology engineering * database technologies and data integration * HCI and user interface design for Web/multimedia content The position starts as soon as possible, is open until filed and will be granted for initially two years with extension possibility. *2 Doctoral Students (50% TV-L E13 or equivalent stipend)* The ideal candidate holds a MS degree in Computer Science or related field and is able to consider both theoretical and practical implementation aspects in her/his work. Fluent English communication and programming skills are fundamental requirements. The candidate should have experience and commitment to work on a doctoral thesis in one of the following fields: * semantic web technologies and linked data * knowledge representations and ontology engineering * database technologies and data integration * HCI and user interface design for Web/multimedia content The position starts as soon as possible and will be granted for initially one year with an extension to overall 3 years. HOW TO APPLY Excellent candidates are invited to apply with: * Curriculum vitae and copies of degree certificates/transcripts, * Writing samples/copies of relevant scientific papers (e.g. thesis), * Letters of recommendation. Further information can be also found at: Please send your application in PDF format indicating in the subject 'Application for PhD/PostDoc position‘ to Further information can be also found at: Jobs" "query with no result??!!" "uHello, I found your project a great work! Thanks for working hard on that to get it this far. I am very new to DBpedia and SPARQL. I would like to use it to make a database of all visualization tools available: I run this simple query: PREFIX dbpedia0: SELECT ?Tool WHERE{ ?Tool a dbpedia0:Work.Software . ?Tool dbpedia2:genre \"Visualization\" . } I get no result?!!! I know at least one tool to have been set for dbpprop:genre to \"Visualization\" (it is Gephi)So my question is what Am I doing wrong? Many many thanks for taking some time to answer this. Parisa u0€ *†H†÷  €0€1 0 + uOn 2014-06-29 09:52:22 Parisa Noorishad < > wrote: Parisa, You might have already solved your problem by now, but I will answer anyway. Maybe it will help. There are a couple of problems with your query. The first problem is on this line: ?Tool a dbpedia0:Work.Software With the prefix dbpedia0 you've set, this is short for: ?Tool There are no matches for the combined \"Work.Software\" but there is one match for \"Work\" and one match for \"Software\". (In DBpedia, \"Software\" is a kind of \"Work\".) 
The second problem with your query is on this line: ?Tool dbpedia2:genre "Visualization" . With RDF and SPARQL, there are different kinds of literals: a plain literal "Visualization" only matches exactly that plain literal, not a language-tagged value such as "Visualization"@en and not a resource. It is worth checking on the endpoint how the genre value is actually stored for Gephi and then matching it accordingly, for example with a FILTER on the string value." "Unicode characters" "uHi everyone, I'm José Aguiar and I'm using the Portuguese short abstract file, but I saw there are some characters in Unicode encoding such as \u escapes. How can I convert them to the right UTF-8 character? Thanks. uHi Jose, they are escaped Unicode characters; you can look here for details On Fri, Mar 11, 2011 at 3:24 PM, José Aguiar < > wrote: uYes, I know, but how can I convert them in Perl to extract the valid character? :S uJosé, I tried this: And the top result gave me this: my $unescaped2 = Unicode::Escape::unescape($str4); Is that what you need? By the way, is there anybody organizing the PT internationalization effort? Should we team up? Cheers, Pablo On Fri, Mar 11, 2011 at 3:08 PM, José Aguiar < > wrote: uI get this error :\ Can't locate Encode/Escape.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl uHi Pablo, By the way, is there anybody organizing the PT internationalization effort? There is already a team and you are of course welcome to join. You can find some details on the Internationalization page ( The mentioned issues are solved but you can always add something new :) Internationalization discussions take place on the dbpedia-developers list as they are more technical. best, Dimitris" "Ontology browser in the MappingTool, no results." "uHi, I tried to use the MappingTool (for Dutch) and noticed that the OntologyBrowser does not show any classes etc. Am I missing a point here or is there somewhere a malfunction in the software? I am running the MappingTool in a Firefox browser. Thanks, Roland uI think I have seen the same report for Portuguese. Dimitris, does it work for Greek?
Cheers Pablo On May 18, 2012 10:04 PM, \"Roland Cornelissen\" < > wrote: uNope, I remember we had this issue before but Lazaros Ioannidis with Robert Isele (cc) dealed with it. Maybe they can help. Cheers, Dimitris On Sun, May 20, 2012 at 2:54 AM, Pablo Mendes < > wrote: uHi again, I talked with Lazaros and sorted things out. We had to press \"sync ontology with mediawiki\" button and provide an admin passwod. This creates a local copy of the ontology and the password is to prevent server abuse [1]. I think it should be working now but will not contain any further ontology changes until next syncing. Pablo, can you check this and maybe setup a cron job? The mapping server is in Berlin right? [1] On Sun, May 20, 2012 at 1:41 PM, Dimitris Kontokostas < >wrote:" "probably incorrect mapping to schema.org from MusicalArtist" "uThe ontology says that MusicalArtist is a subclass of schema.org:MusicGroup. This seemed very odd to me, but then I looked at schema.org and noticed that schema.org:MusicGroup can also be for *solo* artists. However, MusicalArtist is for any musical artist, not just soloists. So this mapping still looks incorrect. peter uDear Peter For eyes of ontologists, it is well known that DBpedia ontology is incorrect, but I have never check about the contents of schema.org. Thank you for the info. I am planning to correct trustable ontologies like schema.org, but I do not know how to revise or advice the contents of schema.org. Does anyone know it? Or does anyone have interest the portal sites of trustable ontologies? Seiji Koide uschema.org is controlled by the schema.org partners, Google, Yahoo!, Bing, and Yandex. Contributions from the community are accepted, but are vetted before being added to schema.org See One problem with alignment to schema.org is that the formal meaning of the schema.org ontology is unusual and not fully explained. peter On Apr 14, 2014, at 10:08 PM, $B>.=P(B $B@?Fs(B < > wrote: uI try not to get hung up on the idea of having one right hierarchy but assume most end users will need to interpret the types that exist in the way that makes sense for what they are doing. The idea of foaf:Agent, which is a superclass of both person and organization, is a powerful concept because of properties shared by these two \"things\"; for instance, either can be a party to a lawsuit. Even in music you could say a brand like \"Michael Jackson\" is a team effort. On the other hand some people want :Flutist to be a subclass of :Person and it makes sense to say one person who plays the flute is a :Flutist but you can't say that a trio that all plays the flute is a :Flutist. Everybody has some theorem they expect the system to prove and they won't accept your axiom set unless you can prove their theorem with it. On Tue, Apr 15, 2014 at 6:51 PM, Patel-Schneider, Peter < > wrote: uHmm. Well, perhaps one could argue that there should be no hierarchy at all. Using the DBpedia ontology does commit you to a lot of things, many of them quite questionable. For example, in the DBpedia ontology churches are buildings, which is not true for many churches, and not even true for the physical location associated with many churches. This is one of the things that I think needs to be changed. 
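A quick way to check what the published ontology actually asserts for the class under discussion is to query the public endpoint directly; whether the schema.org link is stated as rdfs:subClassOf or owl:equivalentClass may vary between releases, so this is only a sketch:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?relation ?target WHERE {
  dbo:MusicalArtist ?relation ?target .
  FILTER (?relation IN (rdfs:subClassOf, owl:equivalentClass))
}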
peter On Apr 15, 2014, at 4:57 PM, Paul Houle < > wrote: uHi Peter, My observation is that crowdsourced knowledge bases (namely, Wikipedia, DBpedia, schema.org, Freebase, etc) can be excellent sources for the description and characterization of things and entities, but the structures that may be derived from them will by definition be incoherent at the TBox level. Exhortations to many contributors to be more coherent at a structural level are not likely, I believe, to meet with much success. The motivations of contributors and editors are most often local within the KB space. Thus, in microcosm, many parts of these KBs can look pretty good, but when the scope extends more broadly across the KB, the coherence breaks down. There aren't many advocates for structure-wide coherence. As an advocate for structure-wide coherence and one who is not afraid to wade into the fray, perhaps you can work some useful magic. I'm dubious, but I truly wish you luck. Our approach, which we have been working on for some years episodically, with another episode due shortly, is to use a coherent structure (UMBEL, in our approach, which is a faithful, simplified subset of Cyc) to provide the TBox, and then to find defensible ways to map the entity and concept information in crowdsourced KBs to that structure. We have been talking about this for so long that it is time for us to complete our initial development and put something forward that you and others can similarly scrutinize. We hope to have something useful by this summer. Thanks, Mike On 4/15/2014 6:55 PM, Patel-Schneider, Peter wrote: uI agree that it seems harder to crowdsource ontologies. However, Wikipedia seems to have a half-decent organization, so maybe it is possible. My view is that Wikipedia is succeeding, and not just in overall organization, because there are Wikipedia editors that challenge and remove incorrect and incoherent information and also spend time oncleanup tasks. I think that any crowdsourced artifact needs to have participants that spend considerable time on these tasks. It appears that the DBpedia ontologyhas not been subject to much of this kind of activity. Just today I went through the DBpedia class taxnomy and marked classes that were arguably misplaced. I found over 50 out of about 500 non-sport classes(and the sport classes probably all need some attention). The misplacement of DBpediaclasses is not justa problem with the DBpedia ontology, of course, as the DBpedia taxonomy is used to generate type statements for all DBpedia resources. The only aspect of DBpedia that makes this not quite so severe a problem is that many of the misplaced classes have few or no instances. However, some of the misplaced classes, e.g., FictionalCharacter, ChessPlayer, PokerPlayer, Saint, Religious, Monarch, Medician, Galaxy, Restaurant, Country, Grape, and Venue, have a significant number of instances. peter On 04/15/2014 07:16 PM, Mike Bergman wrote: uOn 4/15/14 8:55 PM, Patel-Schneider, Peter wrote: You can triangulate around these issues via other TBoxes that provide \"context lenses\" into DBpedia. Example, using YAGO which ultimately connects to DBpedia via an owl:sameAs relation in the ABox. [1] G2D6HBO uThe trouble isn't that the current concepts are wrong or that yours are wrong but that the word \"Church\" means a lot of things to a lot of different people. 
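Peter's point above about which misplaced classes actually matter in practice can be checked empirically. A rough sketch that counts instances per ontology class (it may hit result or time limits on the public endpoint):

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?cls (COUNT(?s) AS ?instances) WHERE {
  ?s a ?cls .
  FILTER (STRSTARTS(STR(?cls), "http://dbpedia.org/ontology/"))
}
GROUP BY ?cls
ORDER BY DESC(?instances)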
Church Buildings are important landmarks and they get drawn on maps; they're often architecturally interesting and knowing about them helps you get around and talk to people about your environment so you don't always depend on existential qualifiers like \"the one runs the big the soup kitchen\" or \"the one that gets visited frequently by the bomb squad.\" It can't possibly harm your soul to know and to name these things. It is easy to describe these things because they have properties that are easy to populate such as date of construction, architectural style, etc. A trouble with the word \"Church\" is that it is culture bound. It's fair to say that a \"synagogue\" is a \"Jewish church\" and a \"mosque\" is a \"Muslim church\" and it's not unusual to see buildings that have been used by one Abrahamic faith be used for another. It's not unusual for the same worship space to be used by people of twelve or so faiths, Abrahamic or not. In the 'en' cultural zone I'd expect people to accept that a Shinto Temple which isn't too different from a Christian church in structure and function, is a 'Shinto Church'. In other cultural zones, a good word needs to be picked and a similar meaning promulgated. \"Religious Building\" could be determined by deed (in cultural zones that have a well-defined system of land title), but I know plenty of churches that own the building the pastor (and possibly his entourage) lives in and I wouldn't usually call that a 'Church', and some people wouldn't accept that the building that the church rents out for birthday parties and regularly runs a bingo parlor in is a 'Church'. For each large Shinto Temple there are perhaps 100s of small shrines that have a similar role in the landscape as larger churches (spiritually important for the participant of this place-based religion), yet don't have weekly or even monthly organized ceremonies at them. What you're talking about has something to do with \"Religious Organization\", but I don't think the Knights of Columbus or the Baptist Student Union are \"Church(s)\" even though in some sense the KoC is part of the Roman Catholic \"Church\". Maybe what you mean is \"Congregation?\" With this you can model that the Irish go to mass at 10am at St. Mary's and that the Italians go at noon. That three protestant denominations hold services at one 'Church' and that the Rabbi and his flock were there the previous day? Or that there exists, somewhere, a coven (with no name) that meets every full moon, preferably outdoors, and does a ritual that isn't particularly scandalous but is nevertheless secret and won't be documented as an instance in Wikipedia? The trouble will that for DBpedia is that what we work with is the output of the Wikipedians and their world view. A satisfying set of \"Church Building(s)\" can be easily had from the source material, but if you want a greatly improved and practical model of religious practice you'll need to be able to \"agree to disagree\" with Wikipedia. On Tue, Apr 15, 2014 at 8:55 PM, Patel-Schneider, Peter < > wrote: uOn 4/16/14 12:36 PM, Paul Houle wrote: +1 \"Context Lenses\" are inherently fluid. This is ground zero, in a nutshell, when dealing with data. Yep! Yes, sometimes this vital point gets lost. Which is ultimately achievable via a different set of \"Context Lenses\" (aka Ontology or \"World View\") . That this is possible, in a loosely coupled fashion, is one of the most compelling (demonstrable) virtues of the entire Semantic Web vision. 
Kingsley uDeferring to Wikipedia is definitely one of the things that DBpedia should do, but that cannot be done completely blindly. For example, Guantanamo Naval Base uses the military structure infobox, but there is no way that the entire base can be considered to be a military structure.Either a different infobox should be used in Wikipedia or the mapping rule in DBpedia adjusted. On the other hand, the subclass relationship between military structure and building is entirely from DBpedia. Military structures are not necessarily buildings, so the DBpedia ontology should becorrected here. The situation with Church and its siblings isindicative of the choices faced. Wikipedia defines church as\"a religious institution, place of worship, or group of worshipers, usually Christian\". None of these are necessarily buildings, but many Christian places of worship are buildings. The instances of Church in DBpedia are largely populated from the French Wikipedia infobox Édificereligieuxwith type containing \"glise\". These entities all appear to be buildings.So in the end it seems reasonable to defer to Wikipedia here, and use the Church class for church buildings. Changing the name of the class to Church_(building) should also be done, but it may be that this is not worth worrying about, aslong as there is a comment that Church is for church buildings. Monasteryis populated in a very similar manner, but quite a few monasteries in French Wikipedia are not buildings, and not even architectural structures. To fix this problem requires moving Monastery outof Buildingand even out of ArchitecturalStructure. This is a change that I feel must be made, even though it goes counter to the indications in the infobox name. peter On 04/16/2014 09:36 AM, Paul Houle wrote: uI agree that a way forward would be to use a well-defined ontology, and set up themappings from infoboxes to map into that ontologyinstead of the DBpedia ontology. This would require, I think, a more powerfulmapping technology, one that can use arbitrary combinations of the infobox information to determine the correct classes to map an infobox to. However, another way forward would be to continue to use a small ontology, essentially a fixed version of the DBpedia ontology. A small ontology is better here, I think, than a large ontology, because a large ontology presents too many choices for mappings. However, it may be that a large ontology is requiredbecause each language needs slightly different targets for many of its infoboxes. peter On 04/15/2014 07:16 PM, Mike Bergman wrote:" "dbpprop:redirect dbpedia:About:_URI_scheme" "uHello DBpedia people, We are seeing the following oddity in the data at : 'Opera' redirects to 'About:_URI_scheme'. Here's our query: PREFIX dbprop: select ?link, ?redirect WHERE { dbprop:redirect ?link . OPTIONAL {?link dbprop:redirect ?redirect} } Here's the JSON response at time of writing: { \"link\": { \"type\": \"uri\", \"value\": \" Is it safe to assume that that just happened to be the state of the Opera Wikipedia page at the moment the dump from which that data was extracted was created? And that it will remain that way until the next dump and extraction? Thanks, John Muth This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. 
Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uHi John, On Tue, Jun 15, 2010 at 1:19 PM, John Muth < > wrote: The reason for the extraction of the triple was not the Opera Wikipedia article being a redirect at the time of the extraction (see To be honest, after looking at this case, I cannot see how this triple got extracted. Also, I was not able to reproduce the error with the current version of the extraction and the Wikipedia version of today. Therefore, you can assume that the next DBpedia release will fix this problem. Regards, max" "dbpedia down - bif:contains at fault?" "uHello! MacTed asked me on #semsol to submit my dbpedia SPARQL queries here, because maybe they are somehow connected to the problem of dbpedia going down very often after your recent upgrade. I use your web interface for some very harmless queries, but maybe there is a problem with virtuoso's full text indexing extension, as every time I use bif:contains in a query, dbpedia is immediately gone afterwards. The query in question is always like this: prefix dbpedia2: prefix foaf: prefix rdfs: SELECT ?entity ?abstract WHERE { ?entity dbpedia2:abstract ?abstract . ?entity rdfs:label ?label . ?label bif:contains \"'Ludwig Wittgenstein*'\" . FILTER (lang(?abstract )='de') . } LIMIT 20 Best regards, Thomas Hello! MacTed asked me on #semsol to submit my dbpedia SPARQL queries here, because maybe they are somehow connected to the problem of dbpedia going down very often after your recent upgrade. I use your web interface for some very harmless queries, but maybe there is a problem with virtuoso's full text indexing extension, as every time I use bif:contains in a query, dbpedia is immediately gone afterwards. The query in question is always like this: prefix dbpedia2: < Thomas uHi Thomas, Thanks for the information, the cause of this issue is already being looked into by development and we should have a resolution soon at which point a post will be made to the mailing list notifying everyone Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 2 Sep 2008, at 14:07, Thomas Schandl wrote: uThomas Schandl wrote: So Virtuoso is taking \"Wovon mann nicht sprechen kann, daruber muss mann schweigen\" a bit too seriously ;) On a more serious note, I just heard word it seems to not be recent code that's causing this. A patch should be underway soon. Yrjänä uHi Thomas, This issue has now been resolved, thus your query below runs successfully against the dbpedia server. Please confirm this is now functioning as expected for you Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 2 Sep 2008, at 14:30, Hugh Williams wrote: uHi, Great, it works - and is very fast, too! Did these queries really cause the problems? Is it also possible to use SCORE_LIMIT in SPARQL queries like mine? If yes, can you give me an example, I don't know how the syntax for that should look and what the upper and lower limits for the value is. Regards, Thomas On Tue, Sep 2, 2008 at 8:18 PM, Hugh Williams < >wrote: uHi Thomas Yes, those forms of queries where causing the problem which was an issue in the Virtuoso Server and has now been fixed. SCORE_LIMIT is not supported in the Virtuoso SPARQL implementation or the SPARQL specification itself from what I can see, but is an extension we shall consider adding at some point. 
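For readers who want to rank the results of such a full-text query, Virtuoso's bif:contains can expose a relevance score. This is a Virtuoso-specific extension rather than standard SPARQL, so the exact option syntax is worth checking against the current Virtuoso documentation; the sketch below also swaps in dbo:abstract for the older dbpedia2:abstract as an assumption:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?entity ?abstract ?sc WHERE {
  ?entity rdfs:label ?label .
  ?label bif:contains "'Ludwig Wittgenstein'" option (score ?sc) .
  ?entity dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "de")
}
ORDER BY DESC(?sc)
LIMIT 20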
Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 2 Sep 2008, at 21:04, Thomas Schandl wrote:" "Missing file at "generate-settings"" "uHi all, There seems to be a problem running the \"generate-settings\" script, as Wikimedia has removed the file * on. (it contains a list of wikipedia languages) Has any of you run into the same problem, or am I missing something here? (I'm using the \"dump\" branch of the extraction framework. Is it still the most updated branch to work with? How do I find out in case this changes?) Anyway, for the moment, I'm using a cached copy of that file (available from Google), and I sent an email to with the hope they'll put that file back where it was. Cheers, Omri *Omri Oren* Algorithm Engineer +972-54-786-2659 < > visit us at corp.everything.me" "URIs with "<" in them confusing Virtuoso and Jena" "uDOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" This is similar to the recent ampersand issue. An old URI RFC (section 2.4.3) states that angle brackets are illegal in URIs, but the current spec and RFC 3986 seem to allow them(!). The dbpedia3.1 externallinks_en.nt file has several URIs with \"<\" which is leading to confusion for both Virtuoso and Jena. For example, at ?p ?o } remain in light, cws uHi, Chris uHi Ted, You are to be thanked (Thank you!) for continued clear, lucid and thorough discussions of these matters. I suspect your research takes much time, and the formulation of your responses do as well, but for me, these are massively helpful inputs to the community. Please keep up the great work! Mike Ted Thibodeau Jr wrote: u* Ted Thibodeau Jr [2008/08/25 09:26 PM -0400] wrote: Whoops! My bad uHi Ted, nice analysis. Ted Thibodeau Jr wrote: Are you using 3.1? I found ~16 using grep \" ]*<[^>]*>\" filename Indeed. uChris, Ted, On 25 Aug 2008, at 23:44, Chris Schumacher wrote: No, it does not allow them. Angle brackets are illegal in URIs. The URI grammar in RFC 3986 explicitly lists all characters that are allowed in a URI. If a character is not listed, it is obviously not allowed. Angle brackets are not listed and thus not allowed. Check the handy summary of the URI grammar on pages 48/49. Makes it easy to trace which characters are allowed where. (As with any character that is not allowed, it can be %-encoded, and the resulting triplets (%3C, %3E) can be included in a URI.) On 26 Aug 2008, at 02:26, Ted Thibodeau Jr wrote: Wrong. They are not valid. uRichard Cyganiak wrote: That's good to know; so instead of a processing bug, it's just a dataset bug, or maybe a bug in the dbpedia URI checker. Looking upstream, it might be nice if wikipedia restricted url submission to legal urls since they get so widely replicated. I'm sure they've considered it I've noticed that the dbpedia 3.1 data is much cleaner than the 3.0, which is greatly appreciated. I suppose I'll set up a strict URI checker before importing, and just reject the triple completely if anything is amiss. I should be doing that anyway. thanks!!" "Links from DBpedia to Geonames" "uHi folks Geonames contains about 400,000 references to DBpedia entities (this is about 5% of the overall Geonames features), asserted by rdfs:seeAlso links. See e.g., The links to DBpedia are actually inferred from links to Wikipedia, continually added by Geonames. Geonames does not use owl:sameAs links any more, because those were leading to crazy inferences, merging \"from outside\" features which are kept distinct inside Geonames. 
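One way to quantify the asymmetry described here is to count the owl:sameAs links on the DBpedia side; a sketch, assuming the Geonames features are identified in the sws.geonames.org namespace:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(*) AS ?links) WHERE {
  ?dbpediaResource owl:sameAs ?geonamesFeature .
  FILTER (STRSTARTS(STR(?geonamesFeature), "http://sws.geonames.org/"))
}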
DBpedia uses owl:sameAs links to Geonames, under its own responsibility :) But there are far fewer links from DBpedia to Geonames than the other way round, and my question is: where do the links from DBpedia to Geonames come from, since they are not, most of the time, present in Wikipedia data? Thanks for any clarification! uHello, I think the owl:sameAs links to external datasets are made by Silk when a new release is created. See "credits" on this webpage : Best. Julien. 2013/9/12 Bernard Vatant < > uHello Julien Thanks, but I don't see Geonames listed in external datasets in the page you mention Bernard 2013/9/12 Julien Plu < > uMaybe they have a list of Silk configuration files that link many datasets to DBpedia, which they run before each DBpedia release, like this one : linking DBpedia to their Geonames dataset loaded in a local triple store. 2013/9/12 Bernard Vatant < > uHi Bernard, Maybe these links could help: 1- 2- HTH Ghislain uJulien Whatever the method, features which have been in Geonames for a long time are not linked from DBpedia, although Geonames has had a Wikipedia/DBpedia link for ages in its description. Example : Bernard 2013/9/12 Julien Plu < > uOn Thu, Sep 12, 2013 at 1:47 PM, Bernard Vatant < > wrote: Those two things have different names, "Mount" vs "Mont" (does Geonames have multilingual support?), so it's probably doing simplistic name matching. The config file for mountains is here if you want to improve it: Tom uIn your example there is no sameAs link from DBpedia to Geonames, certainly because Silk didn't find any match between them. Whereas if you take this URI : So as Tom said it's certainly because of a language issue. Best. Julien. 2013/9/12 Tom Morris < > uHi Tom Thanks for the pointer, which makes me wonder. Since Geonames puts a lot of work into building quality links to Wikipedia, why does DBpedia not simply leverage this work and use those links? They are available in the RDF (see e.g. the individual feature descriptions, although fetching all individual RDF descriptions is heavy and verbose). It is simpler to parse the alternate names file, which contains the Wikipedia links as "alternate names" with the "language" link. Quite ad hoc, but efficient. Whichever way, you don't need any fine-tuned matching algorithm, just harvest the 400,000 triples that are already there. Best Bernard 2013/9/12 Tom Morris < > uHi, I downloaded alternateNames.zip (thanks for the link, Bernard) and ran a few bash/sed/awk commands to generate new links from DBpedia to geonames. I encountered a few minor encoding issues etc., but nothing serious. The new links will be available for download and querying soon. Cheers, JC On 12 September 2013 23:47, Kingsley Idehen < > wrote: uHi Jona Good news! Please ping me when the new data are available through the SPARQL endpoint! Best Bernard 2013/9/19 Jona Christopher Sahnwaldt < > uHi Bernard, I don't know when they'll be available in the SPARQL endpoint, but you can already download them from A smaller number of links is also available for other languages, e.g.
Cheers, Christopher On Sep 20, 2013 10:14 AM, \"Bernard Vatant\" < > wrote:" "Missing properties after mapping extraction" "uHi everybody, I encountered another problem after defining new mappings for dbpedia. I created a German mapping \"Ort in Argentinien\": As you can see it includes mappings for the properties \"Provinz\" and \"URL\". Unfortunately when I test the mapping the mapped properties province and wikiPageExternalLink are not within the results, at all. Furthermore if you have a look at the results of Ushaia and compare them with the source text of the corresponding article missing properties: department (Departamento) and leaderName (Bürgermeister). Does anybody spot the root of all evil in this example? Cheers, Bastian uHi Bastian, the \"evil\" thing here is easy to explain: {{Infobox Ort in Argentinien |Name=Ushuaia |Provinz=Tierra del Fuego |Bürgermeister=Federico Sciurano |URL= }} With a look at the infobox of Ushuaia you can see that the property values don't contain wikilinks. \"Provinz\" is mapped to the OntologyProperty:County, \"Bürgermeister\" is mapped to OntologyProperty:LeaderName and URL is mapped to OntologyProperty:WikiPageExternalLink (personally, I would prefer foaf:homepage btw). All of these three ontology properties are object properties, therefore they expect a wikilink. The \"Infobox Ort in Argentinien\" converts the provinz and the url value into links, so that at the rendered wikipeda page you found the internal link to The DBpedia framework extracts the infobox property values and not the information that is rendered from it, therefore that information is lost. regards, paul uHi Paul, I don't know if I got you right. Do you mean that merely such properties can be extracted: [[some value]]? If that is true I wonder why also simple values (without the brackets) are extracted in other cases. For instance in |Lage=Südafrika is evaluated and the information does not vanish during the extraction process. Just test the mapping at Cheers, Bastian uHi Bastian, generally, if an infobox property is mapped to an ontology object property, the DBpedia framework \"searches\" for Wikilinks in the infobox property value. For instance a fictitious property: | country = World cup 2010 in [[South Africa]]. If this property is mapped to an ontology object property, the DBpedia framework would extract a triple with an object: . If the infobox property is mapped to an ontology datatype property with xsd:string as range, the framework would extract a triple with an literal as object: \"World cup 2010 in South Africa\"@en . In a case where an infobox value doesn't contain a link, but is mapped to an ontology object property, normally nothing would be extracted. But the framework do some simple entity recognition. In your example: \"|Lage=Südafrika\" a Wikipedia article exists that has exactly the name of the property value: regards, Paul uHi Paul, ok now I get it. With your explanation in mind I'm just wondering how to circumvent the obstacles set by articles that use infoboxes like \"Ort_in_Argentinien\". The problem is that when using the template one has to select a specific province which is defined in a list of provinces: | [[Liste der Provinzen Argentiniens|Provinz]]: | {{#switch: {{{Provinz|}}} |Salta = [[Salta (Provinz)|Salta]] |Buenos Aires (Provinz) = [[Buenos Aires (Provinz)|Buenos Aires]] (Provinz) When these provinces' names do not correspond to the actual titles of the wikipedia articles the extraction of the specific property fails. 
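A practical way to verify which mapped properties actually reach the extraction output is to list the ontology-namespace triples of a test resource. A sketch, where both the German endpoint and the resource URI are assumptions to adapt:

SELECT ?p ?o WHERE {
  <http://de.dbpedia.org/resource/Ushuaia> ?p ?o .
  FILTER (STRSTARTS(STR(?p), "http://dbpedia.org/ontology/"))
}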
So what possibilities do we mappers have to handle that problem? Or is it inevitable without restricting the syntax of infoboxes? Regards, Bastian 2011/4/4 Paul Kreis < >: uHi Bastian, yes, you hit on a weakness of the current mapping language. Right now, a mapper could do nothing. The mapping language doesn't provide a sufficient solution for this problem. We have this issue on our to-do list. A possibility could be to transfer the \"{{#switch: {{{Provinz|}}}\"-template with its abbreviations to a conditional mapping that defines constant values to ontology properties. regards, paul" "Questions about DBpedia and SPARQL" "uHi everyone, my name is Piero Molino and I'm a student form the University of Bary, Computer Science department. For my degree thesis I'm working on an ontology based retrieving algorithm and, after seeking for while, i decided to use DBpedia as my multi-domain ontology. Anyway I'm at the real beginning with SPARQL (links to books/tutorials/anything usefull are realy really welcome) and probably what now seems to me to be a problem really isn't: while I figured out how my algorithm would make inference over DBpedia, I'm lacking of the first step (wich isn't really what my thesis is about, so i can reuse someone else approach to it). Basically i have a list of words and i have to map each of them to a DBpedia resource. I trying to figure out how i can do it, i thought i could start taking a look at the free text search from the DBpedia website ( ) but for each example query I get a javascript+html response and an empty result. Isn't it working or it's me i can't undestand the results in the right way? Anyway simple text search wouldn't be enought because of disambiguation issues, so i thought i can use Gabrlovich's ESA ( ) to retrieve a wikipedia page for each word in the list and then get the DBpedia resource relative to the wikipedia page. Because of my actual lack of expeience over SPARQL i don't know if there is a simplier way to achieve the same result. I would really appreciate if someone could help me both in extendind my SPARQL knowledge and in finding a better and simplier solution for the problem i'm trying to solve. Thank you, Piero Molino uHello, Piero Molino wrote: If you want to get from a string to a set of URIs, you can use the DBpedia Lookup service: The API doc is here: You could also query the SPARQL endpoint directly using the build in bif:contains function of Virtuoso. Actually, it is very simple to get from Wikipedia URLs to DBpedia URIs by using the lookup service above or modifying the URLs (which is after all what we do when extracting data from Wikipedia: we use replaced by \"/\" and %3A replaced by \":\"). For general advice on SPARQL documentation, tutorials etc., this probably isn't the right group (please ask at the W3C Semantic Web mailing list, but make sure to search the web first). Kind regards, Jens" "Six properties for a person's date of birth" "uHi, Just joined this list, and noticed a post from Tim Finin, who asked exactly the question that occurred to me on browsing through a typical dbPedia page (for Edinburgh). That is, that dates of birth and death appear as different properties: p:birthplace p:birthPlace p:cityofbirth etc. I can see how this diversity arises, due to the harvesting approach used. However, (and I'm probably showing my ignorance here) couldn't owl:sameAs statements be added to the database to indicate that these are all actually the \"same\" property? p:birthplace owl:sameAs p:birthPlace etc. 
And having added these equivalence statements, couldn't SPARQL queries be phrased that would actually pick out all the people who were born in a particular place? Richard Light uRichard, On 15 May 2008, at 16:46, Richard Light wrote: Yes, this could be done. The problem is that Wikipedia is huge, and there are tens of thousands of properties covering all sorts of domain. Hence we would need either an automated approach to find those duplicated properties, or lots of volunteers who go through the dataset and find them manually. Yes, this would be possible. Best, Richard uIn message < >, Richard Cyganiak < > writes Presumably it's possible to put in a SPARQL query to find all the properties that have been used? All the properties that relate to people? I'm coming into this from a historical perspective (specifically museums), and my interest would be in sorting out the properties which relate to: - people - places - events - dates - [museum] objects Would this be a manageable subset? If so, I'm willing to help. It would be interesting, additionally, to map these properties to those in the CIDOC Conceptual Reference Model. Richard uHow would you know which properties related to people - you can find all the properties for something that is a person but that doesn't mean they apply *just* to people ? Me too, we've been doing some work in this area and have some properties which we think define people and companies (places next). Whats the best way to go about getting this info into dbpedia ? Rob It uRichard, On 16 May 2008, at 14:16, Richard Light wrote: You can get an idea of the properties used for a particular kind of resource by following these steps: 1. Find the class(es) used in DBpedia of that kind of thing. For example, if you look at a couple of persons, you will find that they usually have the classes yago:Person100007846 and foaf:Person. Classes in DBpedia are quite inconsistent, so this is not an easy task. 2. Use a SPARQL query like this to find the properties used for any particular class: SELECT DISTINCT ?p WHERE { ?s ?p ?o . ?s rdf:type . } You can run the query at 3. Use a SPARQL query like this to find out how often a particular property is used (to see if it's worth spending time on it): SELECT COUNT(*) WHERE { ?s dbpedia2:dateOfBirth ?o . } 4. Find example triples that use a particular property (to see *how* it's used, and if it's indeed interchangeable with another property): SELECT ?s ?o WHERE { ?s dbpedia2:dateOfBirth ?o . } The end result of this process could be triples like this: dbpedia2:dateOfBirth owl:equivalentProperty dbpedia2:birthdate . We could load this into the SPARQL endpoint then. Best, Richard uI've used the two you give: this gives two sets with a large degree of overlap and not that many new cases for the second class (foaf:Person). I suspect it be a case of diminishing returns to hunt for any more classes used to identify people. Done that. I found (by asking for an ordered result set) that these queries returned more than the allowed number of hits, so I added regex filters looking for: born birth died death place date and copied the resulting XML into a single document. I then sorted this, stripped out duplicates (e.g. placeOfDeath would be found by two of these searches), and hand-crafted the attached XML document, which attempts to group concepts together. This establishes that the \"obvious\" ones vastly outnumber the oddities. This is a bit depressing, I must say Still, I suppose you can still query the ones which aren't total cabbage. 
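The steps above can be folded into a single aggregate query on a SPARQL 1.1 endpoint; note that the original exchange predates SPARQL 1.1 aggregates and relied on Virtuoso's own COUNT extension, so this is a sketch rather than what was run at the time:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?p (COUNT(*) AS ?uses) WHERE {
  ?s a foaf:Person .
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT 100

Sorting by usage count makes it easy to decide which property variants are worth mapping first.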
I assume that you would want there to be strict equivalence between the properties thus linked? I'm assuming so, in which case the number of equivalences will be quite small. There are part/whole type relationships between some of them, e.g. monthofbirth and dateofbirth cityofbirth and placeofbirth Can this type of relationship between properties be [usefully] expressed? Final question (for now): there are for example seven variants on \"birthDate\". Do I need six owl:equivalentProperty statements or 21? Richard xml version='1.0' encoding='%SOUP-ENCODING%' popplace uOn 16 May 2008, at 22:06, Richard Light wrote: I think another way to deal with this could be to use LIMIT and OFFSET, so you could start with: SELECT WHERE { } LIMIT 1000 OFFSET 0 And then increase the OFFSET in steps of 1000. I'm not sure. None of the properties have a strict formal definition, so it's hard to say if two properties are equivalent or not. Perhaps a good rule of thumb would be: If I query for either of the two properties, would the additional values returned from the other property improve the result? If it would, then we could assume equivalence. The first one not. In the second case, you can say: :cityofbirth rdfs:subPropertyOf :placeofbirth . Formally, this states: \"If someone's :cityofbirth is X, then their :placeofbirth if also X\", which makes sense. Note that it doesn't necessarily work the other way round. If it worked either way, then the properties would be equivalent. A good question. In terms of formal semantics, six are enough, and the others are implied by the OWL semantics. An OWL reasoner would would take care of that automatically. But for SPARQL queries against the DBpedia endpoint, which doesn't have reasoning built-in, it would make things simpler if we had all the equivalences. I propose to leave it at six for now, but would really like to hear what others think about this question. Richard uRichard Cyganiak wrote: Richard, Note that Virtuoso does support inferencing for subclass and subproperty. The issue here is that no inferencing rules have been requested or created during the various data loads into Virtuoso. See: Kingsley uOK: attached is a text document containing what I think are 49 correctly-framed declarations of equivalence and sub-property-ness for various birth and death date and place properties. Are the equivalences the most helpful way round? (The \"standard\" property is on the right of the expression.) If not, it's trivial to change the XSLT to swop them round. More importantly, given Kingsley's comments, can we now do anything useful with these statements? The implication is that subPropertyOf _could_ be supported, but hasn't yet, and the doc he refers to makes no mention at all of owl:equivalentProperty (only owl:sameAs, which relates to classes not properties). Richard (Light) http://dbpedia.org/property/birthyrProperty owl:equivalentProperty http://dbpedia.org/property/birthYear . http://dbpedia.org/property/yearofbirth owl:equivalentProperty http://dbpedia.org/property/birthYear . http://dbpedia.org/property/birhplace owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/birthlocation owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/birthPlace owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/birthplace owl:equivalentProperty http://dbpedia.org/property/birthPlace . 
http://dbpedia.org/property/bornIn owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/bplace owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/irthplace owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placeBirth owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placebirth owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placeOfBirth owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placeOfBirtht owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placeofbirth owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/placeOfIrth owl:equivalentProperty http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/birthcity rdfs:subPropertyOf http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/birthcountry rdfs:subPropertyOf http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/cityofbirth rdfs:subPropertyOf http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/countryofbirth rdfs:subPropertyOf http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/countyofbirth rdfs:subPropertyOf http://dbpedia.org/property/birthPlace . http://dbpedia.org/property/dateDeath owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/datedeath owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/dateOfDeath owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/dateofdeath owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/deathDate owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/deathdate owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/deathDateProperty owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/deathOfDate owl:equivalentProperty http://dbpedia.org/property/deathDate . http://dbpedia.org/property/deathyrProperty owl:equivalentProperty http://dbpedia.org/property/deathYear . http://dbpedia.org/property/yearofdeath owl:equivalentProperty http://dbpedia.org/property/deathYear . http://dbpedia.org/property/deathlocation owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/deathPlace owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/deathplace owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/diedIn owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/diedPlace owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/placeDeath owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/placedeath owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/placeOfDeath owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/placeofdeath owl:equivalentProperty http://dbpedia.org/property/deathPlace . http://dbpedia.org/property/cityofdeath rdfs:subPropertyOf http://dbpedia.org/property/deathPlace . 
http://dbpedia.org/property/countryofdeath rdfs:subPropertyOf http://dbpedia.org/property/deathPlace . In message < >, Richard Cyganiak < > writes uRichard Light wrote: Richard, We can inference over rules for classes and subclasses. I'll look at what you've sent when I have a moment. If you've sent triples then we can create rules etc If all is done right we might even make a nice tutorial out of this whole thing. Kingsley uIn message < >, Kingsley Idehen < > writes Sounds good. If I've got it slightly wrong, let me know and I'll sort it out. The property mappings are an XML document, so it's trivial to re-express them differently if I've misunderstood the required format. Is it a worthwhile project to consider the overall ontology represented by the extracted WP classes and properties, and to work out how to improve retrieval in dbpedia by (a) adding this sort of equivalence mapping and (b) recommending \"best practice\" for class and property values in WP templates, to improve the quality of what comes in in the first place? Richard uRichard Light wrote: Meant to say subclasses & subproperties. Nothing else to do, we have eough to experiment with etc. uHi Kingsley, Kingsley Idehen wrote: Having this lightweight inference support for subclasses and subproperties would be great. We tried it locally on one of our own machines by loading DBpedia into Virtuoso and creating a rule. However, we ran out of memory while the rule was created (the machine has 4GB RAM - there may have been other tools consuming some of the memory). Do you have an idea how much memory is required to load and run DBpedia in Virtuoso with inference support for subclasses and subproperties? Would it be possible to include inference support in the public SPARQL endpoint? If I understand correctly, there shoudn't be any significant performance drawback for SPARQL queries, which do not explicity ask for inference support (using \"sparql define input:inference $ruleset\"). Kind regards, Jens uJens Lehmann wrote: Jens, We are completing a similar test here based on my response earlier this week. I assume you are trying to load the Yago Class Hierarchy? If so, let us finish our investigation and then we will have a proper report :-) We should have leveraged Virtuoso's inferencing capabilities from the get go :-) uRichard Light wrote: Richard, Here is the tutorial dump, and this has been applied to the live instance, so please test. There is more to come re. this matter as we are going to load the Yago Class Hierarchy once we iron out some problems . uHello, Kingsley Idehen wrote: Was the test successful? Yago was already loaded into our local Virtuoso instance at that time. Then we used rdfs_rule_set (' or something similar to enable inferencing, which returned an out of memory error after a couple of hours. Kind regards, Jens uKingsley, Thanks for doing this: I'll have a play next week when I have a bit more time. I notice that you have set up the variants as sub-properties, rather than say owl:sameAs. Does this imply that we have, in effect, one \"preferred term\" and N \"non-preferred terms\" for each property? If so, is there merit in developing a set of \"preferred names/URIs\" for the most widely-used properties extracted from Wikipedia? The existence of a preferred set of URIs for these properties would help others to harmonize their practice and improve interoperability. Richard In message < >, Kingsley Idehen < > writes uRichard Light wrote: Correct. Yes. This is basically what we achieve via the inference rules. 
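Until such inference rules are active on a given endpoint, a client can get a similar effect by enumerating the variant properties directly; a sketch using a SPARQL 1.1 alternation path, with an illustrative subset of the variants declared above:

PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?person ?place WHERE {
  ?person dbp:birthPlace|dbp:placeofbirth|dbp:cityofbirth ?place .
}
LIMIT 100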
Yes-ish. I say so because the UMBEL will also address some of these matters via it's ontology aspect. Likewise the upload of the Yago Classes used in the current DBpedia release. That said, if you have the time, no harm approaching it your way, all we have to do is simply add more rules or add to the existing rules. Yes. Kingsley uJens Lehmann wrote: Jens, The is an issue with rules loading that Yago exposed. We will soon be done with the issue. Once resolved, we will apply to the public instance and then produce a note about configuration etc uHello, Richard Light schrieb: We added a simple ini file for different spelling of the same property in infoboxes: When a property on the left hand side is encountered in infoboxes, it will be rewritten to the one on the right hand side. Of course, this handles only a few cases manually. If you have any further additions to this file, drop me a message. Kind regards, Jens" "Important News Re. IBM Watson and DBpedia" "uAll, I've just got official confirmation from IBM about DBpedia being a critical part of the Watson & Jeopardy project [1]. In plain English, to the DBpedia community, this means: you are free to say: Watson is using DBpedia i.e., a great demonstration of what DBpedia enables! IBM will work their way through attribution in their collateral, starting with project web page etc, but for now, I've been told IBM officially acknowledges the use of DBpedia re. Watson. Great stuff everyone!! Links: 1. 2. a-system-designed-for-answers.html" "six properties for a person s date of birth" "uHi Tim, Interesting use case! which is a known issue of DBpedia uGeorgi Kobilarov wrote: Ah" "Querying wiki/dbpedia for presidents' ages at inauguration" "uI did a blog entry today about showing Jon Udell how DBPedia can do a lot more than he realized: Bob DuCharme" "Link rel/rev lower case" "uHello, DbPedia gurus! Could you please clarify why rel and rev attributes values are written in lower case? For example, I go to Regards, Alexander Hello, DbPedia gurus! Could you please clarify why rel and rev attributes values are written in lower case? For example, I go to Alexander uOn 9/25/10 11:26 AM, Alexander Sidorov wrote:" "@devs: Please add Indonesian namespace" "uHi Riko, it's great that you want to start Indonesian mappings! I'm sorry that there is no Indonesian namespace yet. Only developers can add a namespace and it takes some time (an hour or so). @DBpedia developers: I'd like to do it, but don't know if I will have time in the next few days. Whoever has time, please shout out and go ahead. The process is described here: Cheers, JC uback to list To whom it may concern: Please add Indonesian (id) and Urdu (ur) namespaces. (Adding two namespaces at the same time is hardly more work than adding one namespace.) JC On Thu, Feb 28, 2013 at 2:23 PM, Hamza Asad < > wrote: uUrdu Means Roman Urdu (Which is written in english format). Its Humble Request Regards On Thu, Feb 28, 2013 at 6:39 PM, Jona Christopher Sahnwaldt < uHi Hamza, I 'll try to have this ready by the beginning of next week. I'll let you know how it goes Best, Dimitris On Fri, Mar 1, 2013 at 7:45 AM, Hamza Asad < > wrote: uHi Riko, You can start creating mappings for Indonesian and Urdu :-) There are no statistics for now but we plan to add them shortly Best, Dimitris On Fri, Mar 1, 2013 at 2:04 PM, riko adi prasetya < >wrote: uHi, I was trying to generate statistics locally and had to set InfoboxExtractorConfig.extractTemplateStatistics to true to have the xxwikiinfobox-test.ttl file generated. 
As far as I can read from the code, this file is needed to create statistics for the count of properties used for each template. I don't see this mentioned in the wiki page [1] but I think this should be wrote down somewhere. Cheers Andrea [1] 2013/3/5 Dimitris Kontokostas < > uOn Wed, Mar 13, 2013 at 1:49 PM, Andrea Di Menna < > wrote: Thanks, I added it to the page. JC uHi Riko, If you need them soon it would speed things up if you generated them [1] on your own and send them to me, otherwise I'll try to generate them sometime next week. Best, Dimitris [1] On Tue, Apr 9, 2013 at 7:23 AM, Riko Adi Prasetya < >wrote: uDear All, Is there any Ontology Infobox Types in ROMAN URDU language? Second thing, is there any region wise ontology e.g Continent specific OR country specific?? On Tue, Apr 9, 2013 at 10:47 AM, Dimitris Kontokostas < >wrote: uThanks Riko, It's online: Best, Dimitris On Tue, Apr 9, 2013 at 10:16 AM, Riko Adi Prasetya < >wrote:" "DBpedia Data Quality Evaluation Campaign" "uDear all, As we all know, DBpedia is an important dataset in Linked Data as it is not only connected to and from numerous other datasets, but it also is relied upon for useful information. However, quality problems are inherent in DBpedia be it in terms of incorrectly extracted values or datatype problems since it contains information extracted from crowd-sourced content. However, not all the data quality problems are automatically detectable. Thus, we aim at crowd-sourcing the quality assessment of the dataset. In order to perform this assessment, we have developed a tool whereby a user can evaluate a random resource by analyzing each triple individually and store the results. Therefore, we would like to request you to help us by using the tool and evaluating a minimum of 3 resources. Here is the link to the tool: details on how to use it. In order to thank you for your contributions, a lucky winner will win either a Samsung Galaxy Tab 2 or an Amazon voucher worth 300 Euro. So, go ahead, start evaluating now !! Deadline for submitting your evaluations is 9th December, 2012. If you have any questions or comments, please do not hesitate to contact us at Thank you very much for your time. Regards, DBpedia Data Quality Evaluation Team. dbpedia-data-quality uOn 11/15/12 11:58 AM, wrote: uOn Thu, Nov 15, 2012 at 7:05 PM, Kingsley Idehen < >wrote: uOn 11/15/12 11:58 AM, wrote: uHi everyone, Would it be possible to edit already evaluated resources? I'm not confident of what I have just submitted. Cheers, Marco On 11/15/12 12:27 PM, Kingsley Idehen wrote: uHi Marco, Unfortunately this option is not available in this version of the tool. Cheers, Dimitris On Thu, Nov 15, 2012 at 7:31 PM, Marco Fossati < > wrote: u0€ *†H†÷  €0€1 0 + uOn Thu, Nov 15, 2012 at 7:44 PM, Kingsley Idehen < >wrote: uAm 15.11.2012 19:12, schrieb Giovanni Tummarello: Its not about factual correctness, but about correct extraction and representation. If Wikipedia contains false information DBpedia will too, so we can not change this (at that point). What we want to improve, however, is the quality of the extraction. Best, Sören uI'd be pretty skeptical that the error rate for unpaid evaluators would be less than the error rate in the data itself. Are you making it clear to people what the standard of performance is? Are we supposed to check stuff against a human reading of Wikipedia or actually verify the facts? 
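As a complement to manual evaluation, some classes of extraction errors can be surfaced automatically with consistency queries. A minimal sketch that flags people whose recorded death precedes their birth:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?person ?birth ?death WHERE {
  ?person dbo:birthDate ?birth ;
          dbo:deathDate ?death .
  FILTER (?death < ?birth)
}
LIMIT 100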
When I see data quality problems in Freebase or DBpedia they often involve global properties that aren't detectable at the level of individual nodes. For instance, there are the two great trees of living things and geographical containment. Often these have obscure breakages at high level nodes that will break any algorithm that assumes these things are trees. And it generally turns out that things are sketchy at certain high level nodes where some taxonomists introduce levels of classification that others don't and don't get me started on those anglophone islands on the other side of the English channel. In cases like that you can't count on getting accurate answers from average people and your odds aren't even that good if you ask an expert. Certainly there is a lot of noise in the category assignments in Wikipedia. It might be reasonable to expect people to flag incorrect category assignments but without some global view, finding the ones that are missing (maybe 40% of them in some cases) is too much to ask. uAm 15.11.2012 19:44, schrieb Giovanni Tummarello: or issues. Once we reduced the number of problems significantly you are perfectly right and we should look for \"bad smells\" Best, Sören uDear Matthew, Thanks for your suggestions. On 15 Nov 2012, at 19:59, Matthew Gamble wrote: In order to help users understand the errors, we have provided an example and description for each of them. We have provided a link to the corresponding Wikipedia page where the user can look at the original data and compare it with the extracted data in DBpedia. Yes, that's a good idea. However, in the current version of the tool it only displays what a user would see while browsing DBpedia. Each triple is in fact separated out so that a user can only choose and specify individually which triple has a data quality problem. Also, it is important to know which resource the triple belongs to, in order to evaluate it's quality. uHi, In case you missed out on the following email, here's a chance to win either a Samsung Galaxy Tab 2 or an Amazon voucher worth 300 Euro !!! Please help us in evaluating the quality of DBpedia by using the tool: Thank you very much for your time. Regards, DBpedia Data Quality Evaluation Team. On 15 Nov 2012, at 17:58, wrote:" "DBO ontology problems" "uFollowing the announcement of Dimitris Kontokostas about migration to WebProtege, I thought I'd share some observations about the ontology itself. I can add more details if there's interest. I see these main problems with the ontology: 1. many classes and properties need better description 2. some properties are redundant or ambiguous. This is exacerbated by the lack of descriptions, e.g. it takes data examination to figure out that \"event\" means the same as \"sportDiscipline\" and should be eliminated. 3. a few \"syntax errors\" re classes and capitalization of properties, I think I wrote about those (DID I?) E.g. firstAccentYear rdfs:domain Peak,Volcano is not properly parsed (nor are multiple domains a good idea). 4. Mapping to external ontologies, esp. schema.org is not good. I also consider statements about owl:Thing to be useless. I'll write separately about this, but what's the chance the DBpedia maintainers will agree to emit separately: - the ontology mapping statements (filtering by namespace). I have a simple example. - data triples mapped to external ontologies 5. Specific properties (e.g. Person/height 6. 
Since domain/range are not taken into account by the mapped extractor, I think it's better to emit them as schema:domainIncludes, schema:rangeIncludes rather than rdfs:domain, rdfs:range (which have strict, uncompromising semantics) 7. "topical pages" should use foaf:focus not skos:subject, I wrote about this The migration to WebProtege should take care of 3. I've played with WebProtege once but haven't used it. I hope it has as good collaboration features as MediaWiki, in particular subscribing to a class or discussions about it. If not: maybe we can have discussions in the Class/Property Discussion namespaces (which currently nobody uses), even though the definitions are in WebProtege. Cheers! u(fingers faster than the mind error) a. They provide more "natural" units for that specific measurement. E.g. I could look for tall people like this: ?x dbo:Person/height > 180 b. But I'm not sure it's a good idea to have them: I have to know there's such a property, and to check what the unit is (I bet that is not documented). It's just as easy to write ?x dbo:height > 1.80 c. Furthermore, a) is not correct SPARQL since one can't have a slash in a prefixed name. So I propose to rename them to e.g. dbo:Person_height, etc. Cheers!" "How to turn a local chapter live ?" "uHi, I've recently configured a live endpoint for the German chapter, but I don't really know which namespace to use. I'm currently using a Live uses the live.dbpedia.org namespace, and the Dutch chapter scrapped the dumps and made the live extraction their main endpoint. Maybe we should have a naming convention regarding the way the local chapters name the live endpoints, maybe or do we just scrap the dumps . Also the way the resources in the live endpoints are named needs to be considered, do we keep the else ? Another problem is dbpedia_dav.vad: it has issues in dealing with an endpoint that is in a subfolder of a domain (/live), I'm currently hacking at it but it's not going to be pretty. For the local chapter admins that are interested in hosting their own endpoint, I'll post a tutorial and a preconfigured VM in the next days. However, first you need to request access to the Wikipedia OAI-PMH proxy from the main chapter. Cheers, Alexandru Alexandru Todor Freie Universität Berlin Department of Mathematics and Computer Science Institute of Computer Science Königin-Luise-Str. 24/26, room 116 14195 Berlin Germany uHi Alexandru, +1 for the tutorial and the VM, very much appreciated. I think the Not sure how to request access to the wikipedia OAI-PMH proxy from the main chapter though. Cheers, Marco On 2/4/14, 2:38 AM, Alexandru Todor wrote:" "selecting data for a given resource / page" "uWhat query can I use to get the data (all) for a given resource / page such as: uI believe this would be the DESCRIBE command of SPARQL: In the case of Tom Cruise, I thought this would look something like this: DESCRIBE Or: DESCRIBE ?resource WHERE {?resource foaf:page " However, this doesn't seem to work for the DBpedia endpoint. The query doesn't finish.
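A minimal Scala sketch of the two query shapes discussed in this thread, assuming the public endpoint at http://dbpedia.org/sparql and the Tom Cruise resource URI (the concrete URIs were stripped from the archived mail); the result format values are likewise assumptions about what the endpoint accepts.

import java.net.URLEncoder
import scala.io.Source

// Sketch only: both queries target the DBpedia resource URI, not the Wikipedia page URL.
object DescribeVsSelect extends App {
  val endpoint = "http://dbpedia.org/sparql"
  val resource = "<http://dbpedia.org/resource/Tom_Cruise>"
  val describe = s"DESCRIBE $resource"
  val select =
    s"""SELECT ?property ?hasValue ?isValueOf WHERE {
       |  { $resource ?property ?hasValue }
       |  UNION
       |  { ?isValueOf ?property $resource }
       |}""".stripMargin
  def run(query: String, format: String): String =
    Source.fromURL(endpoint + "?query=" + URLEncoder.encode(query, "UTF-8") +
      "&format=" + URLEncoder.encode(format, "UTF-8")).mkString
  println(run(describe, "text/turtle")) // full description of the resource
  println(run(select, "text/csv"))      // the SELECT shape that Snorql can handle
}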
But in one of the examples for the snorql interface, this has been done instead: SELECT ?property ?hasValue ?isValueOf WHERE { { ?property ?hasValue } UNION { ?isValueOf ?property } } Link: This should work similar to what i expected DESCRIBE would do. Hope this helps. Marian Am 04.10.2008 um 20:07 schrieb < > < >: uI believe this would be the DESCRIBE command of SPARQL: In the case of Tom Cruise, i thought this would look something like this: DESCRIBE Or: DESCRIBE ?resource WHERE {?resource foaf:page \" However, this doesnt seem to work for the DBpedia endpoint. The query doesnt finish. But in one of the examples for the snorql interface, this has been done instead: SELECT ?property ?hasValue ?isValueOf WHERE { { ?property ?hasValue } UNION { ?isValueOf ?property } } Link: This should work similar to what i expected DESCRIBE would do. Hope this helps. Marian Am 04.10.2008 um 20:07 schrieb < > < >: uHi there, Not quite right. That URI identifies the Wikipedia article about Tom Cruise. DBpedia has a own URI for identifying the \"concept\" Tom Cruise, i.e. the person himself, with all the data about him. That URI is So your query to get all data about Tom Cruise is DESCRIBE Cheers, Georgi uA side note: SPARQL DESCRIBE can in theory return any data about a concept / URI that the data publisher thinks is appropriate. All implementations of SPARQL DESCRIBE I know of do in fact return all triples directly associated with a URI. So it has, by implementation, the same result as SELECT * where {?s ?p . } UNION SELECT * where { ?p ?o . } But that's only by implementation, not necessary by design. Best, Georgi uBut this doesnt seem to work: Do you know why? Marian Am 05.10.2008 um 05:08 schrieb Georgi Kobilarov: uOn 5 Oct 2008, at 19:39, Marian Dörk wrote: The Snorql interface can only deal with SELECT queries, it doesn't work for DESCRIBE or CONSTRUCT. Best, Richard uMarian Dörk wrote: uThanks, Richard and Kingsley, for clarifying this. Marian" "Introducing Sztakipedia" "uHello, We have made an Intelligent Assistant for Wiki which puts dbpedia in good use, you might be interested in: I wanted to share this on this list for several reasons: 1) I wanted to say thank you for everyone who works on dbpedia, I think this is a great achievement. 2) Right now Sztakipedia is branded as an \"Intelligent Assistant\" which helps you in the boring work of finding links, references, infoboxes, categories etc., while creating a wiki article. But it has been designed as a two way tool from the very beginning - what I mean by that is that we could have the users to help improving dbpedia data only in some a very-nonobtrusive way of course. 3) I am interested in your thoughts and remarks in general - you surely have good ideas about what could be done with this agent in the editor! 4) And finally, the most important thing : recently I was asked to write a book chapter about the ways of using dbpedia data in mashups. Naturally it is my task to do the research and compile a good overview on how dbpedia is used in the wild as part of web interfaces. I am also familiar with the many white papers on this topic. But I still wanted to ask from everyone on this list: What are your favorite applications of dbpedia? In your opinion, what should I emphasize? Thanks you! Best Regards Mihály Héder Computer and Automation Research Institute Budapest, Hungary uHI! Great tool for MediaWiki guys like me! Do you have these tool available for download? And second question, will it work for non-English language? 
Yury On Wed, Oct 5, 2011 at 6:48 PM, Mihály Héder < > wrote: uSzia Mihály, This is truly awesome! You have read my mind. Please take a look at these two related ideas below. Human-powered data fusion: round trip (in/ex)ternal data reuse in Wikipedia One hands washes the other:round trip semantics with SMW and DBpedia Spotlight We should talk about how to integrate Sztakipedia with DBpedia Spotlight (for the inlink suggestion) and Semantic MediaWiki (for relationship suggestion). We could use your client also to collect user feedback and learn from our mistakes. I can see very interesting results coming out of this. What do you think? Best, Pablo On Wed, Oct 5, 2011 at 4:48 PM, Mihály Héder < > wrote: uHello! 1) The toolbar is really a MediaWiki user script (javascipt), not a browser extension or something, and you can enable in your account right now. Check pedia.sztaki.hu \"Enable it in your account\". It communicates with a server endpoint which is provided by us and is totally public and free (but is in beta! Could not test it with crowd load yet). Behind that endpoint there are a couple of servers: UIMA, Solr(Lucene) and other stuff. That stuff is not a beast you just download and install, but you don't need it anyway. 2) Well, your second question is a harder one. What I can promise that we will come up with some general version you can use but with less functionality. -the categorization relies on Yahoo search. As long as yahoo indexes the Wiki of your preferred language we can make it work. (A long-term issue is that we have to pay a small amount for it - some 4$ / 10K search - I will try to find someone at yahoo and ask for their support for instance in exhange for putting their logo in the suggestion window. But right now I don't even have a contact to them.) -Link recommendation relies on tf-idf data and the dbpedia data. To do the tf-idf calculation we need the xml dump of a certain wiki and run some scripts. It takes about a week in case of the english wiki, others are of course much smaller BUT we need some kind of stemmer or lemmatizer to the given language - preferably one which we can integrate with UIMA. We already have integrated snowball, so in theory we are able to process any language snowball supports ( don't do the stemming, in theory tf-idf can still work but problems arise with languages like Hungarian - where we concatenate funky suffixes to the words to signal past tense, posessive, modalities, etc>From dbpedia we use the list of pages so its not optional. -Infobox recommendation is similar - it relies on the XML dump, and the corresponding dbpedia infobox data. If we have those we can start a kind of machine learning (actually done by lucene). To be able to display the infobox fill form with help, we also need certain xml files for infoboxes. -there is a co-occurence learning phase, it relies on XML dump and tf-idf, and is needed for book recommendation. But book search works without that. -Book recommendation is quite simple - you can use the english books, which are often referenced in non-english texts as well. To have non-english books we need library catalogs in some processable format. That can be an issue, I have not even found one for Hungarian yet. However, we could change this part and use library API's like Z39.50. There are always performance issues with those but I can see that sooner or later we need to support those. So to sum up, adding a new language is a piece of work right now and we need certain resources. 
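For readers who have not met the tf-idf weighting referred to above, here is a toy Scala sketch; it is illustrative only (made-up documents, no dump parsing), and as described above a Snowball stemmer would normally be applied to the tokens before counting.

object TfIdfSketch extends App {
  // Toy corpus standing in for per-article token lists from a Wikipedia dump.
  val docs: Seq[Seq[String]] = Seq(
    "dbpedia extracts structured data from wikipedia".split(" ").toSeq,
    "wikipedia dumps are large xml files".split(" ").toSeq,
    "structured data enables link recommendation".split(" ").toSeq)
  val n = docs.size.toDouble
  // Document frequency: in how many documents each term occurs.
  val df: Map[String, Int] =
    docs.flatMap(_.distinct).groupBy(identity).mapValues(_.size).toMap
  // tf-idf for one document: term frequency weighted by inverse document frequency.
  def tfIdf(doc: Seq[String]): Map[String, Double] = {
    val tf = doc.groupBy(identity).mapValues(_.size.toDouble / doc.size).toMap
    tf.map { case (term, freq) => term -> freq * math.log(n / df(term)) }
  }
  docs.foreach(d => println(tfIdf(d).toSeq.sortBy(-_._2).take(3)))
}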
However, we will try German and Hungarian in this year. We will try to simplify the process and will do our best in supporting more languages. But we always gonna need some help form locals to the given country - library data and testing. I hope I answered your questions and you will become a happy user! Best Regards Mihály On 5 October 2011 16:56, Yury Katkov < > wrote: uOlá Pablo! I'm happy that we are thinking along the same lines! Wow, I have some faint memory that I have heard about spotlight but I did not know that it is in this advanced phase! I checked the demo and I must admit that it is far more advanced than our inlink suggestion stuff is now. The logical step is that we integrate it with Sztakipedia. I will check tha API and come up with a plan by the end of next week. The questions I see now are 1) What is the max text length one can provide? Right now we can send the whole edited article part only (that is an issue we should work on) and that can be quite long sometimes 2) What are the languages spotlight supports? 3) In case of wikipedia, we should link to articles in wikipedia not dbpedia. But if I'm correct I can deduce programmatically the correct string from the dbpedia uri so its just a note really. About SMW integration: in theory SzP should work with any kind of mediawiki with the vector skin - and also is a good vehicle to send any kind of suggestions and wikitext parts to the editor. So it should work. What I'm not familiar is how relation suggestion is done. Is it similar to the spotlight solution? About collecting feedback: absolutely. The only issue now is that we are not identifying the users in any way which should be changed to get sensible data. All the best Mihály On 5 October 2011 17:22, Pablo Mendes < > wrote: uFantastic! Looking forward to your suggestions. 1) What is the max text length one can provide? Intentionally there should be no limit. But it seems that in practice, the system starts to choke at about 450K. I am looking at the bottlenecks. 2) What are the languages spotlight supports? Currently English is the only completed version. Portuguese, Spanish and German have started. Any Wikipedia/DBpedia language can be used. Like you pointed out, stemmers will improve results, and so will POS tagging. The more language-specific knowledge you add, the better the results. But it also works reasonably with the simpler (language-independent) solution. 3) we should link to articles in wikipedia not dbpedia. Exactly. One-to-one mapping. What I'm not familiar is how relation suggestion is done. Is it similar to Yes and no. I have some ideas for the relation suggestion, but I have not implemented yet. We can discuss them offline. not identifying the users in any way which should be changed Looking forward to that, then. I am assuming that making an entry to the page history will do? Like \"Suggested: Berlin, Germany; User Selected: Berlin, Texas\" I will check tha API and come up with a plan by the end of next week. Great! Looking forward to that. We can continue the discussion offline or at Also, I will be in Budapest in November. So maybe we can grab a kávé or a sör. Cheers, Pablo On Thu, Oct 6, 2011 at 12:20 PM, Mihály Héder < >wrote: uThank you very much indeed for such a comprehensive answer! And the planned integration with Semantic MediaWiki is going to be really awesome! 
Yury On Thu, Oct 6, 2011 at 1:45 PM, Mihály Héder < > wrote:" "Virtuoso 22023 Error SR008" "uI was querying DBpedia, when I started to receive HTTP 500 errors, the response says: Virtuoso 22023 Error SR008: Function id_to_iri needs an IRI_ID or NULL as argument 1, not an arg of type INTEGER (189) SPARQL query First I thought my query became broken, but I noticed I get the response to the example queries as well: Example 1 Example 2 Is this a temporally server error? uOn 6/27/12 1:12 PM, Konrad Reiche wrote: There something amiss. See the URL below, its a raw SPARQL URL (not snorql) with the hostname changed from dbpedia.org to lod.openlinksw.com. See: . uHi Konrad, Yes, it looks like on of the nodes did not properly initialize. I restarted the node and checked your samples which now all work again. Patrick" "Using Dbpedia Lookup." "uHi! I'm new to the Web Semantic technologies and wanted to use the Dbpedia Lookup Web Service with JAVA. Searching around the Internet, I found a code that sends a word to Lookup, and gets an XML returned. It uses SAX to get the *label*,* abstract* and *URI* from the XML. The results isn't precise though, sometimes it gets the wrong properties. Someone told me this isn't the correct way to use Lookup. The code is bellow, only for you to know it, but what's the correct way to use Lookup? public class DbpediaLookupClient extends DefaultHandler { public DbpediaLookupClient(String query, String queryClass) throws Exception { this.query = query; HttpClient client = new HttpClient(); HttpMethod method; String query2 = query.replaceAll(\" \", \"+\"); if( queryClass.equalsIgnoreCase(\"Person\") ) { method = new GetMethod(\" \"MaxHits=5&QueryClass;=\" + queryClass + \"&QueryString;=\" + query2); } else { method = new GetMethod(\" \"MaxHits=5&QueryString;=\" + query2); } try { client.executeMethod(method); System.out.println(method); InputStream ins = method.getResponseBodyAsStream(); //System.out.println(method.getResponseBodyAsString()); SAXParserFactory factory = SAXParserFactory.newInstance(); SAXParser sax = factory.newSAXParser(); sax.parse(ins, this); } catch (HttpException he) { System.err.println(\"Http error connecting to lookup.dbpedia.org\"); } catch (IOException ioe) { System.err.println(\"Unable to connect to lookup.dbpedia.org\"); } method.releaseConnection(); } private List > variableBindings = new ArrayList >(); private Map tempBinding = null; private String lastElementName = null; public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException { if (qName.equalsIgnoreCase(\"result\")) { tempBinding = new HashMap (); } lastElementName = qName; } public void endElement(String uri, String localName, String qName) throws SAXException { if (qName.equalsIgnoreCase(\"result\")) { if (!variableBindings.contains(tempBinding) && containsSearchTerms(tempBinding)) variableBindings.add(tempBinding); } } public void characters(char[] ch, int start, int length) throws SAXException { String s = new String(ch, start, length).trim(); if (s.length() > 0) { if (\"Description\".equals(lastElementName)) { if (tempBinding.get(\"Description\") == null) { tempBinding.put(\"Description\", s); } tempBinding.put(\"Description\", \"\" + tempBinding.get(\"Description\") + \" \" + s); } if (\"URI\".equals(lastElementName) && s.indexOf(\"Category\")== -1 && tempBinding.get(\"URI\") == null) { tempBinding.put(\"URI\", s); } if (\"Label\".equals(lastElementName) && tempBinding.get(\"Label\") == null) tempBinding.put(\"Label\", s); } } 
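    // Note (not part of the original post): the archive stripped the generic type
    // parameters from this listing. The collection declarations presumably read
    // List<Map<String, String>>, ArrayList<Map<String, String>>, Map<String, String>
    // and HashMap<String, String>; with the stray '>' left behind (e.g. "List >")
    // the fragment does not compile as shown.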
public List > variableBindings() { return variableBindings; } private boolean containsSearchTerms(Map bindings) { StringBuilder sb = new StringBuilder(); for (String value : bindings.values()) sb.append(value); // do not need white space String text = sb.toString().toLowerCase(); StringTokenizer st = new StringTokenizer(this.query); while (st.hasMoreTokens()) { if (text.indexOf(st.nextToken().toLowerCase()) == -1) { return false; } } return true; } private String query = \"\"; } Hi! I'm new to the Web Semantic technologies and wanted to use the Dbpedia Lookup Web Service with JAVA. Searching around the Internet, I found a code that sends a word to Lookup, and gets an XML returned. It uses SAX to get the label , abstract and URI from the XML. The results isn't precise though, sometimes it gets the wrong properties. Someone told me this isn't the correct way to use Lookup. The code is bellow, only for you to know it, but what's the correct way to use Lookup? public class DbpediaLookupClient extends DefaultHandler { public DbpediaLookupClient(String query, String queryClass) throws Exception { this.query = query; HttpClient client = new HttpClient(); HttpMethod method; String query2 = query.replaceAll(' ', '+'); if( queryClass.equalsIgnoreCase('Person') ) { method = new GetMethod(' }" "Dbpedia and Freebase" "uHi everyone! Clearly Dbpedia and Freebase data complement each other: in Freebase the data comes from various sources and it is much cleaner; on the other hand data in Dbpedia is related to more external sources and binded to YAGO, Dbpedia and many more ontologies. I have two questions related to this: 1) I'm thinking of the tutorial on how to use the couple Dbpedia/Freebase in practical projects. Is there any other simular tutorial out there, would it be useful in your opinion? 2) can we generate new mappings or improve the extraction scripts by analysing and parsing Freebase data? Is that a good idea or is there some kind of redundancy there? Is it legal with respect to Google's licences? Sincerely yours, Yury V. Katkov WikiVote! llc Hi everyone! Clearly Dbpedia and Freebase data complement each other: in Freebase the data comes from various sources and it is much cleaner; on the other hand data in Dbpedia is related to more external sources and binded to YAGO, Dbpedia and many more ontologies. I have two questions related to this: 1) I'm thinking of the tutorial on how to use the couple Dbpedia/Freebase in practical projects. Is there any other simular tutorial out there, would it be useful in your opinion? 2) can we generate new mappings or improve the extraction scripts by analysing and parsing Freebase data? Is that a good idea or is there some kind of redundancy there? Is it legal with respect to Google's licences? Sincerely yours, Yury V. Katkov WikiVote! llc uOn 11/22/2011 6:06 AM, Yury Katkov wrote: What people need here is a product, not a tutorial. The factforge query editor does a \"reasonable\" job of combining data from DBpedia and Freebase and making them queryable through SPARQL. Like most similar things, it doesn't provide a consistent point-of-view. If you look at the sample queries at the bottom you often see them using multiple predicates to get information that came in through different sources. owl:sameAs gets used to glom together Freebase and DBpedia concepts; I don't know where they got their owl:sameAs statements, but I know that if you use the ones that come with Wikipedia you'll find some concepts end up getting lost or confused in big Katamari balls. 
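A small Scala sketch (not from the thread) of one way to eyeball that merging: list the owl:sameAs links DBpedia publishes for a single resource and see which external identifiers get glued to it. The endpoint, the example resource and the result format are assumptions.

object SameAsCheck extends App {
  val endpoint = "http://dbpedia.org/sparql" // assumed public endpoint
  val query =
    """PREFIX owl: <http://www.w3.org/2002/07/owl#>
      |SELECT ?other WHERE {
      |  <http://dbpedia.org/resource/Berlin> owl:sameAs ?other .
      |} LIMIT 100""".stripMargin
  val url = endpoint + "?query=" + java.net.URLEncoder.encode(query, "UTF-8") +
    "&format=" + java.net.URLEncoder.encode("text/csv", "UTF-8")
  println(scala.io.Source.fromURL(url).mkString) // one merged identifier per line
}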
If you want to write simple queries and get consistently good results you need some process where you clean the data up, decide what you believe when there is contradictory information and all of that. It would really be great if we had some system that could represent that \"john thinks that mary said Lady Gaga is a man\" but the strategy of throwing it all in a triple store and hope people aren't going to notice they're getting bad results doesn't cut it. I think DBpedia has a different philosophy than Freebase. DBpedia combines a Wikipedia dump with a rulebox that creates a set of triples. The rulebox isn't capable of doing deep cleanup on data. The \"correct\" way to fix something in DBpedia is to fix the data in Wikipedia. If I've got an automated process that say, reconciles places with Geonames and inserts geographic coordinates for 200,000 places, the only way I can get this data is get it into Wikipedia, and that's a very difficult proposition because there are many different infobox templates for different kinds of locations in which coordinates are represented differently. On the other hand, if you want to bulk insert data into Freebase, or fix a fact that's wrong, it's pretty easy to do. DBpedia might look like a database about \"topics\", but it's really a database about Wikipedia pages" "SSL certificate for dbpedia.org" "uDear all, Could we get an SSL certificate for dbpedia.org? For several of my sites, I obtained a free certificate from First, a certificate is of course a precondition for HTTPS, which would be useful for privacy reasons ( Second, since the Web is increasingly switching to HTTPS, this would also allow HTTPS sites to do live things with DBpedia, which is currently impossible. Third, it would also allow HTTP/2, which is particularly useful for Linked Data querying purposes, as this speeds up a lot of successive requests. When a certificate is available, we would immediately make fragments.dbpedia.org also available as HTTPS and HTTP/2. Thanks in advance, Ruben [1]" "Better dbpedia <-> fbase mappings (alpha test!)" "uOk, these dbpedia <-> fbase mappings are the ones I use, just serialized to NT format I think these ought to be a drop-in replacement for the \"freebase_links.nt.bz2\" file. Since it is derived from dbpedia data, the license on this is CC-BY-SA. I'd like it very much if some of you could try to load it into your RDF stores and make sure that it isn't ill-formed. Once I've gotten some feedback I'll put up a little web page about it, including information about how it was derived. There's some other nice stuff I've got that you may be interested in as well, for instance, rdf:type relationships between dbpedia resources and freebase types, and once I've documented that file I'll see what I can spin." "RDFEasy: scalable Freebase SPARQL queries in the AWS cloud" "uhe first RDFEasy product is ready. RDFEasy BaseKB Gold Compact Edition is a SPARQL database based on the Compact Edition of :BaseKB which is a good set of facts to work from if you are interested in Freebase. You can experience it in the most popular cloud environment by going to and making a single click. Once the instance is provisioned you can follow the instructions here to log in and start making queries Hardware and software inclusive, this product costs 0.45 cents an hour with the default configuration, which can host the data set on an internal SSD and handle the bruising query workloads associated with knowledge base development. 
Thus, anyone who wants to try powerful SPARQL 1.1 queries with Virtuoso 7 against Freebase data can get started with very little time and money. Some of you will want the \"whole enchalada\" and that is in the pipeline too, a complete copy of :BaseKB including text descriptions and notability information." "See wikipedia updates" "uI wonder if anyone in this list is aware of a way to see what Wikipedia articles have been updated \"recently\" (say during the last 24 hours or so). I'm aware that Wikipedia provides data dumps here: However, those are from October of last year and I want to download fresher data, and instead of doing 3-4 million HTTP request and download each individual page manually from Wikipedia every day, I thought perhaps there is a way to find out what articles have recently changes so one could keep a constantly fresh copy of Wikipedia locally without re-downloading pages that have not been updated. I'm not sure if this is possible, but someone here might be aware of some kind of feed like this just giving me a list of updated Wikipedia pages. So to clarify, what I'm looking for is to have access to fresh Wikipedia data locally (say 24 hours or 7 days old at most), without putting a great load on Wikipedia servers (to not download every page every day). Thanks /Omid uYes they provide realtime IRC feeds. On Tue, Jan 27, 2009 at 8:36 PM, Omid Rouhani < >wrote:" "Wikinomics: How Mass Collaboration Changes Everything" "uHi, there is a new book that might be interesting: Wikinomics: How Mass Collaboration Changes Everything Did anybody allready read it and can say if it is good? Cheers Chris" "queries with large number of answers" "uHello, I am currently working on an application which is based on querying dbpedia. However I think that the number of the query results given has a certain limit. And if the limit is exceeded, the remaining answers are omitted. Is there any possible way to have access tothe whole result set without having to download the dbpedia datasets? In other words, is there any api which gives the opportunity to query dbpedia via sparql and get the correct number of answers no matter how large it is? Thank you for your time! uHello, Try searching for SPARQL LIMIT and OFFSET. Cheers Pablo On Oct 28, 2012 11:51 AM, \"aliki aliki\" < > wrote: Hello, Try searching for SPARQL LIMIT and OFFSET. Cheers Pablo On Oct 28, 2012 11:51 AM, 'aliki aliki' < > wrote: uHi, in case your answer sets get really large, you might want to have a look at this page: Best, Heiko Am 28.10.2012 22:50, schrieb aliki aliki:" "No page links in Live?" "uI just did $ bzcat ~/dbpedia_2012_05_31.nt.bz2 | grep 'wikiPageWikiLink' | wc and got back zero lines. Is it deliberate that ?s ?o . triples are missing from Live? If that's so, that's very disappointing. :wikiPageWikiLink is one of the most useful predicates in DBpedia, particularly from the viewpoint of one who's merging DBpedia with some other KB and wants something to fill the gaps in other KBs. It's difficult to come up with a better subjective importance score than select count(*) { ?s :wikiPageWikiLink dbpedia:J._Edgar_Hoover . } I use Hoover as an example, because DBpedia, Freebase and other popular knowledge bases don't do a good job of documenting his career. Perhaps you could define events for cops (?s :arrested ?o) and bureaucrats (?) but then you've got to populate them. Hoover, however, used his power to protect his privacy, and a real accounting of his notorious career would be impossible. 
A historical figure as important as Hoover collects relationships with many other entities, and even if they aren't precisely ontologized, they are captured in the page links. This gives him a fighting chance to rank higher than the Foo Fighters or some other entity that secretes large numbers of well-formed and ill-formed bibliographic entries. Adding these links to the public endpoints and making them available in Live would be a big boost for DBpedia users. uHi Paul, On 07/01/2012 05:25 PM, Paul A. Houle wrote: Actually, the \"PageLinksExtractor\" is inactive, so its triples are not there. The number of triples generated by that extractor is extremely large, and if we store all those triples in our DBpedia-Live instance [1], definitely the performance will degrade dramatically. If you are just interested in the number of page links, then we can add a property, such as numberOfWikiPageWikiLinks or so, that will hold that piece of data. [1]" "Semantic Web Journal Special Issue on Quality Management of Semantic Web Assets (Data, Services and Systems)" "uThe standardization and adoption of Semantic Web technologies has resulted in a variety of assets, including an unprecedented volume of data being semantically enriched and systems and services, which consume or publish this data. Although gathering, processing and publishing data is a step towards further adoption of Semantic Web, quality does not yet play a central role in these assets (e.g., data lifecycle, system/service development). Quality management essentially refers to activities and tasks involved to guarantee a certain level of consistency and to meet the quality requirements for the assets. In general, quality management consists of the following four phases and components: (i) quality planning, (ii) quality control, (iii) quality assurance and (iv) quality improvement. The quality planning phase in the Semantic Web typically involves the design of procedures, strategies and policies to support the management of the assets. The quality control and assurance components have their primary aim in preventing errors and to meet quality requirements pertaining to the Semantic Web standards. A core part for both components are quality assessment methods which provide the necessary input for the controlling and assurance tasks. Quality assessment of Semantic Web Assets (data, services and systems), in particular, presents new challenges that were not handled before in other research areas. Thus, adopting existing approaches for data quality assessment is not a straightforward solution. These challenges are related to the openness of the Semantic Web, the diversity of the information and the unbounded, dynamic set of autonomous data sources, publishers and consumers (legal and software agents). Additionally, detecting the quality of available data sources and making the information explicit is yet another challenge. Moreover, noise in one data set, or missing links between different data sets, propagates throughout the Web of Data, and imposes great challenges on the data value chain. In case of systems and services, different implementations follow the specifications for RDF and SPARQL to varying extents, or even propose and offer new, non-standardized extensions. This causes strong incompatibilities between systems, e.g., between the used SPARQL features in the query engines and support features in RDF stores. 
The potential heterogeneity and incompatibility poses several challenges for the quality assessments in and for such systems and services. Eventually, quality improvement methods are used to further enhance the value of the Semantic Web Assets. One important step to improve the quality of data is identifying the root cause of the problem and then designing corresponding data improvement solutions. These solutions select the most effective and efficient strategies and related set of techniques and tools to improve quality. Quality improvement metrics for products and services entails understanding and improving operational processes and establishing valid and reliable service performance measures. This Special Issue is addressed to those members of the community interested in providing novel methodologies or frameworks in managing, assessing, monitoring, maintaining and improving the quality of the Semantic Web data, services and systems and also introduce tools and user interfaces which can effectively assist in this management. Topics of Interest We welcome original high quality submissions on (but are not restricted to) the following topics: - Methodologies and frameworks to plan, control, assure or improve the quality of Semantic Web Assets - Quality exploration and analysis interfaces - Quality monitoring - Developing, deploying and managing quality service ecosystems - Assessing the quality evolution of Semantic Web Assets - Large-scale quality assessment of structured datasets - Crowdsourcing data quality assessment - Quality assessment leveraging background knowledge - Use-case driven quality management - Evaluation of trustworthiness of data - Web Data and LOD quality benchmarks - Data Quality improvement methods and frameworks e.g., linkage, alignment, cleaning, enrichment, correctness - Service/system quality improvement methods and frameworks" "Hello - I hope I'm in the right area and sorry, if I'm not" "uHello, I hope I'm in the correct area, but if not, sorry (and hopefully someone can direct me). I'm active on a WIKI for \"Final Fantasy XI.\" It is at: In the associated forums, I posted an idea for an application, which could use information found in the wiki. This is information is in different area (it deals with monster levels, locations of monsters, as well as, combat skill levels of the various jobs). The thread is at: NOTE: Presently, the monsters and locations have their own individual pages, as well as category pages (I've noted sometime they won't agree on all data). These pages also follow a standard template (well, are supposed to follow). One person suggested there might be programs, which have the ability to extract information from WIKI pages for applications to use. My goal would be two fold (one beyond the thread) 1) Meet the goals of a reverse skill up application, as outlined in the thread. 2) Create a page for the wiki - wiki.ffxiclopedia.org - listing all the monsters known by the wiki and their attributes. To keep this page current / in sync with individual monster pages, well, this application would exist. So, I'm wondering is this the place to find an application / ability to scan and collect information for the FFXIclopedia? If no, ah, is there somewhere to go to obtain this help? If yes, great, where do I go to learn and/or get help? NOTE: One short coming to the WIKI concept I see is keeping stuff in sync. In \"our\" case an example is monster level. 
Say the individual monster page says the level is 83 - 85, YET the category page for the monster's type can list the same monster but give the level as 84 - 86. I understand why this happens, as each page is independent but yet refers to the same information. What would be nice is a way / method to keep the same data in sync across a wiki (or multiple wikia). For example: If one could save a monster and its attributes in a database and then in the page reference the entry (e.g. monster_name.level (e.g. Goblin_Beastman.level) would generate the 83 - 85 and thus everywhere the monster and its level would be in sync). Anyway, thanks for listening and I look forward to any / all positive help I obtain from the list. Thanks!!! Bert. uHi Bert, I think you're in the right place but I don't know how familiar you are with Linked Data and DBpedia. You could read the following papers first to understand how things work and then we could discuss it further There was a recent simlar approach to extract Wiktionary, so this could be a base for your project And a few extra links ;) Cheers, Dimitris On Sat, Dec 8, 2012 at 1:40 AM, I. B. Halliwell < >wrote:" "Portuguese extraction (messed up abstracts)" "uHi DBpedia people, I've been using the latest (3.7) Portuguese dumps to build a custom dbpedia index for Stanbol's embedded SOLR ( I really don't know if this is the right place to say this, basically I've been messing with the generated index and I've found a number of cases where the abstract/comment in PT is all messed up, for example: So tracking these back to Wikipedia, I can see that in these cases there is no Infobox/wiki markup, instead there is LOTS of HTML that results in something similar being rendered. However when extracted this makes the comments/abstracts full of HTML markup :( So I hope this helps identify weak spots in Portuguese dbpedia/wikipedia or at least can be redirected by someone to someone that cares (Pablo Mendes?). At the moment I don't know if there are more cases like this or how many of the articles are like this, I'll go on testing and report back. Best, Alex uAlex, Thanks for your report. I think this is the right place as it may concern more than one localized DBpedia. If you something that is specific to the PT extraction, please feel free to also discuss at I have added a ticket and will try to find a volunteer to look at this. Best, Pablo On Tue, Nov 15, 2011 at 5:07 PM, Alex Lopez < > wrote:" "Possible bug with dbpedia Drug UNIIs" "uHello, There is a property called \"dbpprop:unii\" which I believe should provide an FDA Unique Ingredient Identifier (UNII) code for a dbpedia drug. Unfortunately, it appears that a great number of the codes are incorrect. For example, the following query shows numerous repeated \"0\" values as the uniis for various drugs. If you go back to the FDA's official UNII listing and grep it for the drug names (upper cased), it turns out that all of the drugs in the left column have UNIIs starting with the integer shown in the right column! This is not true for all drugs; this other query returns a correct UNII . This looks like it might be a bug. Does anyone know how this issue can be addressed? kind regards, -Rich uHi Richard, It seems that the URL broke when you shortened it: But in general if there is a 0 value for dbprop:unii it is *probably* because the value was parsed wrongly from the infobox. I looked at: And I can see this: | UNII = 0CPP32S55X Perhaps the parser though that anything that starts with 0 is a number. 
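A hedged illustration of that failure mode (this is not the actual extraction-framework code, just a sketch of a greedy numeric-prefix heuristic): reading digits off the front of the infobox value keeps only the leading "0" of "0CPP32S55X", which matches the truncated integer values reported in the query above.

object LeadingZeroSketch extends App {
  // Greedily read a number off the front of the raw infobox value; illustrative only.
  def numericPrefix(value: String): Option[Double] = {
    val prefix = value.takeWhile(c => c.isDigit || c == '.' || c == '-')
    if (prefix.nonEmpty) scala.util.Try(prefix.toDouble).toOption else None
  }
  println(numericPrefix("0CPP32S55X")) // Some(0.0) -> the UNII collapses to the bogus value 0
  println(numericPrefix("X4Y5Z6"))     // None      -> the value would survive as a string
}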
This might be solvable with a mapping to inform the parser that the value is a string. Have you had a look at the mappings for drugs? e.g. This mapping connects the infobox Drugbox [1] to our ontology. It seems that UNII was not yet mapped to the ontology. I've added a mapping and a property. It seems to work: I hope that data will show up in DBpedia Live soon. Perhaps somebody from your group wants to help with more mappings? [1] [2] Cheers, Pablo On Fri, Jul 20, 2012 at 12:31 PM, Richard Boyce < > wrote: uHi Richard, Please check if you now have editor rights. Otherwise, please send us your username. Unfortunately my mapping is not showing up at DBpedia Live yet. This makes me thing that this is some kind of incompatibility between the live and the new extraction code. Mohamed, JC, can you guys confirm this please? in the meantime, here are a stab at a couple of mappings that might be Great! Thanks. They will definitely be helpful. But unfortunately the mapping system might not be smart enough to handle what you have input there. You would probably have to split that content into a few entries. The rdfs:label, rdfs:domain, rdfs:range go to ontology pages [1], while templateProperty and ontologyProperty go to Mapping pages [2]. Please let me know if the difference is not easy to spot by looking at these two examples: [1] [2] The way the system is set up now, I think you would have to create one page for each ontology property (you can copy+paste stuff from [1] and modify accordingly), then find the Infobox where the data appears and add the mapping from that infobox to the ontology property (just like we did in [2]). Perhaps these links will be helpful: Cheers, Pablo On Sat, Jul 21, 2012 at 1:24 PM, Richard Boyce < > wrote: uHi Pablo, Mohamed, and JC, 1) I added two drugbox mappings (bioavailability and ChEBI) per the example - would you please confirm that these are acceptable? 2) Still no luck with querying FDA UNIIs in live.dbpedia.org: uHi Richard, The property mappings in Drugbox look great! Thanks. I am not sure what is the correct usage of owl:equivalentProperty, though: I think that usually we \"hardcode\" some namespaces, and just reference them like here: But for other cases I don't know what the DEF does. I do think it should be trivial to support in the code, so I copied the style from the other example I could find: I also found out that my example was bad: That is a dbpedia-owl:ChemicalCompound (using Template:Chembox), so I needed to add a mapping for Unii there too. Did you find any incompatibility between the live and the new extraction Perhaps Dimitris could also help here? Any idea why this is not showing? Cheers, Pablo [1] On Tue, Jul 31, 2012 at 3:08 PM, Richard Boyce < > wrote: uOn Jul 31, 2012 3:08 PM, \"Richard Boyce\" < > wrote: example - would you please confirm that these are acceptable? uWhoopssorryI was testing the Linked Data URIs and didn't even check the SPARQL for typos. This is what puzzles me. I can see extraction samples for fdaUniiCode showing: Amoxicillin, Amphetamine, etc. But I cannot see them on live.dbpedia.org: What gives? Cheers, Pablo On Tue, Jul 31, 2012 at 4:19 PM, Jona Christopher Sahnwaldt < uLive extracts data from pages as soon as they change on Wikipedia. other pages using the template probably haven't. I think I read somewhere that live also re-extracts unchanged pages when a template they use or its mapping change, but with lower priority. Mohamed would know the details. JC On Jul 31, 2012 4:29 PM, \"Pablo N. 
Mendes\" < > wrote: uIs there a log anywhere that might help a user know which pages have been updated and when? tx, -R On 07/31/2012 12:11 PM, Jona Christopher Sahnwaldt wrote: uThis did the trick - thank you! Now testing for completeness. -R uOn 07/31/2012 09:40 AM, Pablo N. Mendes wrote: I plan to finish most other mappings with the help of students in the coming two months> I am not sure what is the correct usage of owl:equivalentProperty, though: substituted for each other. Since and are exactly the same I think it works> uHi Richard, I am familiar with the semantics of owl:equivalentProperty, just unsure about its usage (more clearly, I do not know exactly how to add a DBpedia mapping that uses that property, particularly for the case where the namespace of the object is not defined on the DBpedia ontology server). I do not know if namespaces are hardcoded (written in Scala) inside the DEF (DBpedia Extraction Framework), or if there is a way to define them on the wiki (mappings.dbpedia.org). I hope the reduction of jargon per sentence makes the message clearer. :) Cheers Pablo On Aug 9, 2012 9:57 PM, \"Richard Boyce\" < > wrote: uHi Pablo On 08/09/2012 04:36 PM, Pablo N. Mendes wrote: This helps a great deal. Would it be possible and advisable (i.e., not too complicated) for my group to set up DEF locally and run it over current Wikipedia? I see instructions for Ubuntu 10 ( long as things work on Ubuntu 12) it might be the most efficient way to get an updated DrugBox data extraction that uses the new mappings. Perhaps the DEF code could be grepd to find if and where namespaces are defined in DEF. If it is feasible and advisable, would anyone on this list know how to limit the extraction to just Drug pages? kind regards, -Rich Hi Pablo On 08/09/2012 04:36 PM, Pablo N. Mendes wrote: Hi Richard, I am familiar with the semantics of owl:equivalentProperty, just unsure about its usage (more clearly, I do not know exactly how to add a DBpedia mapping that uses that property, particularly for the case where the namespace of the object is not defined on the DBpedia ontology server). I do not know if namespaces are hardcoded (written in Scala) inside the DEF (DBpedia Extraction Framework), or if there is a way to define them on the wiki ( mappings.dbpedia.org ). I hope the reduction of jargon per sentence makes the message clearer. :) This helps a great deal. Would it be possible and advisable (i.e., not too complicated) for my group to set up DEF locally and run it over current Wikipedia? I see instructions for Ubuntu 10 ( (so long as things work on Ubuntu 12) it might be the most efficient way to get an updated DrugBox data extraction that uses the new mappings. Perhaps the DEF code could be grepd to find if and where namespaces are defined in DEF. If it is feasible and advisable, would anyone on this list know how to limit the extraction to just Drug pages? kind regards, -Rich" "N-Triples syntax errors in DBpedia" "uI stumbled upon this on Andy Seaborne's Twitter. Might be useful to stomp out a bug or two. It's a log of the results of parsing the DBpedia dumps with a pretty strict N-Triples parser: ~25k errors, which is not too bad for a 100M+ dataset. Best, Richard uDear All, 1) Is there any classification (groupsetc) of those errors? 2) Should we start looking for some kind of \"semantic errors\" in near future? 
Best regards, Vladimir 2010/4/15 Richard Cyganiak < >:" "Referencing DBpedia resources in blog entries" "uHi, I think this is a simple question, but I was wondering what the best way to link to to a DBpedia URI is in a blog entry in a semantically meaningful way. For example, if I was writing a blog entry that contained a reference to John Wayne, what would be the preferred way to link to the DBpedia page ( would be useful for something like Operator or Fuzzbot (i.e. be useful in a machine-parsable fashion). Thanks, Rob urobl wrote:" "The case of the missing abstracts" "uI've found at least 45,000 cases where en-language abstracts don't appear to be available in dbpedia. A typical example is Note that you can see dbpedia-owl:abstract and rdfs:comment information about this topic in de and fr, but nothing in en, despite the fact that there's a wikipedia page about this topic. Here's the page in wikipedia If I had to hazard a guess about what's going on here, there's a 'contents' box up at the top of the page, but the actual abstract you want is under the 'General' heading, and reads something like \"The university hospital of Heidelberg is one of the largest and most renowned medical centers in the Federal Republic of Germany. It is closely linked to Heidelberg University Medical School (Heidelberg University Faculty of Medicine) which was founded in 1388 and is thus the oldest within the Federal Republic of Germany. The university hospital is actually made up of 12 hospitals, most of them being situated on the New Campus (University of Heidelberg), about 10 driving minutes away from the old town.\" Is this just a case that the extractor isn't handling correctly. I can send a list of many more of these if it would help. uPaul, To me this looks rather like an issue for Wikipedia. Our extractor extracts the text before the first heading of an article as abstract and this corresponds to the suggested authoring style in Wikipedia [1]. So I have the impression that your 45k articles should be send as input to the Wikipedia authors as a list of articles, which should be checked for proper abstract handling. Given the 3.5M articles overall this is just slightly more than 1% of all articles. We could of course also try to implement some workaround in DBpedia, when we start doing that for all issues in Wikipedia, we will never be done. Would be interesting how many of the 45k articles without abstract use \"General\" (or something similar) as their first headline - maybe we could then try to cover at least some of these obvious and common cases, although I thing it would be more sustainable to write a bot for Wikipedia, which deletes \"General\" headings, when they come first in an article. Best, Sören [1] Am 24.02.2011 17:21, schrieb Paul Houle: uOn 2/24/2011 11:38 AM, Sören Auer wrote: For the heck of it you can get the list here To be fair this is incomplete (it happens to only contain topics for which I could find photographs and decode the metadata) and might be a little out of date I got some abstracts out of a dump of the last version of dbpedia and some have come from me doing SPARQL queries so I've got whatever is there. I think a large majority of the ones that I see as missing are still missing. I've noticed that Freebase seems to have extracted abstracts for many of these, so their extractor may have a better heuristic. It also could be that some Wikipedia pages have been extractable at different times and Freebase might have just caught them at a different time. 
Any idea who might be responsible for this at Wikipedia? uHello, On 24.02.2011 19:07, Paul Houle wrote: For questions about bots and technical improvements: on the top there are pointers for other rooms. If you are planning to extend the software and implement a better heuristic, please tell me and I will give you write access to the Mercurial, so you can commit the code back and improve DBpediaRegards, Sebastian" "Reproducing the Spotlight Demo" "uI am having issues reproducing the spotlight demo locally. My version doesn't recognize the same words as the on-line demo. What are likely causes? Here's the demo from the DBPedia User manual: 0Wednesday%20on%20Congress%20to%20extend%20a%20tax%20break%20for%20students% 20included%20in%20last%20year%27s%20economic%20stimulus%20package,%20arguing %20that%20the%20policy%20provides%20more%20generous%20assistance. &confidence;=0.2&support;=20 Here's my version: sident%20Obama%20called%20Wednesday%20on%20Congress%20to%20extend%20a%20tax% 20break%20for%20students%20included%20in%20last%20year%27s%20economic%20stim ulus%20package,%20arguing%20that%20the%20policy%20provides%20more%20generous %20assistance. &confidence;=0.2&support;=20 I've updated the files required for the 0.5 release: . Disambiguation index (Lucene) (Large file used) . Spotter lexicon (~LingPipe dictionary) (Large file used) . Spot selection cooccurrence model: . OpenNLP models for NERSpotter and OpenNLPNGramSpotter Here's my server properties file: I am running on Ubuntu 13.x on AWS - High MemXL - 17.1 Gigs of ram. Thanks for any suggestions uHi Mark, The right list for DBpedia Spotlight questions is We've had a similar thread in the past. Although we did not get to the bottom of the issue, there is a lot of info there that may be useful for you. Cheers, Pablo On Sat, May 18, 2013 at 8:45 PM, Mark Gamache < > wrote: uThanks Pablo. I saw that thread (I think) and I didn't see anything that would help. I'll post to the other list. From: Pablo N. Mendes [mailto: ] Sent: Sunday, May 19, 2013 3:44 PM To: Mark Gamache; DBpediaSpotlight Users Cc: Subject: Re: [Dbpedia-discussion] Reproducing the Spotlight Demo Hi Mark, The right list for DBpedia Spotlight questions is We've had a similar thread in the past. Although we did not get to the bottom of the issue, there is a lot of info there that may be useful for you. Cheers, Pablo On Sat, May 18, 2013 at 8:45 PM, Mark Gamache < > wrote: I am having issues reproducing the spotlight demo locally. My version doesn't recognize the same words as the on-line demo. What are likely causes? Here's the demo from the DBPedia User manual: 0Wednesday%20on%20Congress%20to%20extend%20a%20tax%20break%20for%20students% 20included%20in%20last%20year%27s%20economic%20stimulus%20package,%20arguing %20that%20the%20policy%20provides%20more%20generous%20assistance. &confidence;=0.2&support;=20 Here's my version: sident%20Obama%20called%20Wednesday%20on%20Congress%20to%20extend%20a%20tax% 20break%20for%20students%20included%20in%20last%20year%27s%20economic%20stim ulus%20package,%20arguing%20that%20the%20policy%20provides%20more%20generous %20assistance. &confidence;=0.2&support;=20 I've updated the files required for the 0.5 release: . Disambiguation index (Lucene) (Large file used) . Spotter lexicon (~LingPipe dictionary) (Large file used) . Spot selection cooccurrence model: . OpenNLP models for NERSpotter and OpenNLPNGramSpotter Here's my server properties file: I am running on Ubuntu 13.x on AWS - High MemXL - 17.1 Gigs of ram. 
Thanks for any suggestions" "What happened to the US cities in ?" "uYesterday I tried to get dbpedia URIs for major US cities as examples for a discussion I was having. I couldn't find most of the larger US cities via lookup.dbpedia.org. I did find all of the cities outside the US I searched for. Check out these queries: MISSING CITIES http://lookup.dbpedia.org/query.aspx?q=atlanta http://lookup.dbpedia.org/query.aspx?q=san%20antonio http://lookup.dbpedia.org/query.aspx?q=seattle http://lookup.dbpedia.org/query.aspx?q=dallas PRESENT CITIES US http://lookup.dbpedia.org/query.aspx?q=washington%20dc http://lookup.dbpedia.org/query.aspx?q=new%20york http://lookup.dbpedia.org/query.aspx?q=philadelphia http://lookup.dbpedia.org/query.aspx?q=portland http://lookup.dbpedia.org/query.aspx?q=peoria NON-US http://lookup.dbpedia.org/query.aspx?q=london http://lookup.dbpedia.org/query.aspx?q=rome http://lookup.dbpedia.org/query.aspx?q=paris http://lookup.dbpedia.org/query.aspx?q=berlin http://lookup.dbpedia.org/query.aspx?q=amsterdam http://lookup.dbpedia.org/query.aspx?q=cairo http://lookup.dbpedia.org/query.aspx?q=tokyo http://lookup.dbpedia.org/query.aspx?q=bejing uTim Finin wrote: Tim, Did you try: http://dbpedia.org/fct ? Type in: City Name Filter by Type, Then click on show values or distinct values by count to see matching entity identifiers. uHi Tim, thanks for the heads-up, I'll look at it. Cheers, Georgi uHi Tim, I've fixed the bug in lookup.dbpedia.org and updated the index to the DBpedia 3.4 dataset. Cheers, Georgi" "Article : Setting up a DBPedia SPARQL mirror with Jena" "uHi I wrote this about how to make a mirror of DBpedia 3.8 with Jena TDB, and optionally Jena Fuseki . sparql-dbpedia.html I hope it can be useful to both list users. Comments welcome" "DBpedia mappings" "uHi, I am new at DBpedia world, but I want to get invold and learn more about it. I am interesting is it possible to mapping Infobox of Croatian edition of Wikipedija ( ontology? If it possible, what is the best way to do this? Thanks! Hi, I am new at DBpedia world, but I want to get invold and learn more about it. I am interesting is it possible to mapping Infobox of Croatian edition of Wikipedija ( Thanks! uHi Ivana, thank you very much for your interest in DBpedia and for offering to create new mappings. We have set up a wiki namespace for Croation mappings. So you can now start to write mappings on the mappings wiki, using the Mapping_hr namespace. On the wiki you can also find a documentation on how to write mappings and you can have a look at the English and Greek mappings. There is also a useful Validate button on the edit pages that checks the syntax and mapping validity regarding the DBpedia ontology. Thanks a lot for contributing! Cheers, max On Fri, Jun 4, 2010 at 4:27 PM, Ivana Sarić < > wrote:" "dbpedia URI change in retrieved resource files" "uHello, I have been using DBpedia for a little while now, and something important has changed a few minutes ago: If I retrieve information about a resource using it's JSON or XML representation all the dbpedia URI's in the file are different than what they previously were. Instead of starting with \" they all now start with \"local:/\". Was this intentional? If not, when can we expect to have it fixed? Thanks in advance! Kind regards, Zoltan uHi Zoltan, We are looking into this issue right now. I will advice as soon as it is fixed. Patrick uHi Zoltan, The problem has now been fixed. Patrick uHi Patrick, Indeed, it is working correctly now. 
Thank you for the fast response! Zoltán On 2011.08.25. 15:14, Patrick van Kleef wrote:" "sparql limit and offset max value" "uin private static final String QUERY_STRING_FORMAT = //\"PREFIX rdfs: \" + //\"SELECT ?l WHERE {\" + //\"[] rdfs:label ?l.\" + //\"}\" + //\"LIMIT %d \" + //\"OFFSET %d \"; limit and offset are limited to a maximum value of 2 147 483 648 but, what is the maximum acceptable value ? Best regards Luc DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" in private static final String QUERY_STRING_FORMAT = //\"PREFIX rdfs: < + //\"SELECT ?l WHERE {\" + //\"[] rdfs:label ?l.\" + //\"}\" + //\"LIMIT %d \" + //\"OFFSET %d \"; limit and offset are limited to a maximum value of 2 147 483 648 but, what is the maximum acceptable value ? Best regards Luc uOn 19/08/11 04:04, luc peuvrier at home wrote: Theoretically there is no maximum acceptable value, but in practice this is determined by the SPARQL engine you use and the programming environment in which it runs. Your example is in Java, the number set here is 2^31 which is the maximum value (or actually one above the max value) of a 32 bit signed integer (that is, a Java int). If you need a higher value than that (yes, it's been known to happen), there are several high-performance SPARQL processors that implement limit and offset by a 64-bit signed integer ( = a Java long), meaning the max value is 2^63 - 1. HTH, Jeen" "DBpedia Mappings: Request for editor rights" "uDear DBpedians, my name is Norman Weisenburger I am master student at the university and currently looking at DBpedia in the context of my thesis. I’ve had a closer look on the company infobox type. According to the DBpedia ontology airlines are subclasses of companies. Therefore they inherit properties like assets, netIncome, operatingIncome, revenue, numberOfStaff… However, beside revenue, these properties are not mapped, i.e. not extracted. Examples where an extension of the mappings would extract further triples are: Same applies for other subtypes of company, e.g. publisher -> revenue Therefore I would like to request editor rights in order to enhance these mappings according to the Mappings wiki ( Thanks in advance! Best regards Norman Dear DBpedians, my name is Norman Weisenburger I am master student at the university and currently looking at DBpedia in the context of my thesis. I’ve had a closer look on the company infobox type. According to the DBpedia ontology airlines are subclasses of companies. Therefore they inherit properties like assets, netIncome, operatingIncome, revenue, numberOfStaff However, beside revenue, these properties are not mapped, i.e. not extracted. Examples where an extension of the mappings would extract further triples are: Norman uHi Norman, Welcome to the DBpedia project, we appreciate your interest in improving the mappings. Before giving you editor rights we need to know your username in the mappings wiki. I assume it is \"norman\", but you need to confirm it. Cheers, Alexandru On Apr 14, 2014 12:11 PM, \"Norman Weisenburger\" < > wrote:" "English Words" "uIain Sproat wrote: Here's a better example of a topic that has two meanings, or Wikipedia goes right out and says it\"This article is about the chemical element and its most stable form, O_2 or dioxygen. For other forms of this element, see Allotropes of Oxygen.\" This is annoying because you can't make entirely truthful statements about \"Oxygen\" if you conflate the element and the diatomic gas. 
For instance, most of the mass of the ocean (water) is the ElementOxygen. If a system also understood that people breathe \"Oxygen\" it could come to the wrong conclusion that people can breathe in the ocean. Note that freebase treats oxygen as a \"Chemical Element\", but also a \"Medical Treatment\"; the Medical Treatment is the use of the diatomic gas, which doesn't appear to be otherwise documented in Freebase. The text in wikipedia does a good job at explaining the taxonomy of substances: reading it, it makes clear distinctions between elements, compounds, allotropes, etc. So far, generic databases have done a poor job of taxonomizing \"stuff\". It seems to me that the problem is tractable, but people have stopped short of the work it takes to do it: an introductory chemistry textbook does a good job of explaining it that bypasses the \"representational thorns\" that Cyc and other efforts have gotten caught up on. u[I left dbpedia-discuss on the distribution, but I'm not sure why they got tacked on at the very end of the conversation. They'll probably need to go check the data-modeling archives to get caught up on what the conversation was about.] On Tue, Aug 18, 2009 at 10:01 AM, Paul Houle< > wrote: I think Iain was referring to the defined intent of a Freebase topic and a synset. The example that you've identified represents a bug where the difference between the definition of a Wikipedia article (whatever the editors wants it to be) and a Freebase topic (a single concept) hasn't yet been cleaned up. Wikipedia is what it is, so it's really up to the users of structured data to decide how much effort they want to put into tidying things up. Unfortunately, there are lots of folks who would be happy to make use of a well structured data set when it's done, but not a lot of folks (so far?) who are interested in contributing time and effort to making it better structured. Unless someone figures out how to break the circle, we'll all be left bemoaning the inadequate and \"annoying\" state of the data. So did you click the little \"split\" flag on the edit page? Better yet, did you use one of the split tools to cleave the properties into two sets and distribute them among the appropriate topics for the gas and the chemical element? I agree that \"people\" need to deal with this, but \"people\" includes everyone with a vested interest. If you need better data for your app, that includes you. Tom" "No result for Category-label in English" "uHello, when i put the query" "Geocoordinate issue on de & fr dbpedia" "uHi, There has been word [1] for several months now that geocoordinates on de.dbpedia get rounded to integers and are thus useless. I've found the same issue on some infoboxes in fr.dbpedia. Is this a known issue and is there a fix planned? The root cause is that an unhandled format (dd/mm?/ss?/[NSEW]) is used for coordinates. Indeed the GeoCoordinatesMapping class assumes that when there are only two properties for latitude/longitude they are necessarily decimal, which is not true. It then calls doubleParser.parse which results in keeping only the degree part. The same format is also used by the {{coordinate}} template [2] which is said [3] to be the current standard in de.wikipedia and could thus be added to GeoCoordinateParser and GeoCoordinateParserConfig once the issue is fixed. 
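(For later readers: the truncation shows up directly in the extracted data. Below is a sketch of a diagnostic query against the affected chapter endpoint, e.g. de.dbpedia.org/sparql or fr.dbpedia.org/sparql; the whole-number pattern is only a heuristic, since some places legitimately sit on integer coordinates.)

PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?place ?lat ?long WHERE {
  ?place geo:lat ?lat ;
         geo:long ?long .
  # dd/mm/ss values parsed as decimals lose everything after the degrees,
  # so they surface as suspiciously round latitudes such as 43 or 43.0
  FILTER ( regex(str(?lat), "^-?[0-9]+(\\.0+)?$") )
}
LIMIT 100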
Here are two test cases, first with infobox (rounded), second with {{coordinate}} template (ignored): * * If nobody is addressing this issue I will try to submit a patch later this year, since I just had my very first look into extraction framework and am not familiar with scala. Cheers, Nono {1] [2] [3] comparison uHi Nono, Yes there are still problems for coordinates not given as doubles; As I understand there is a regex in for expressions fo the kind 43°50'51''N 18°21'23''E, bit not for the format you say. A solution may be to add a second regex. The template \"coordinate\" is missing in I see no issue related on I can help if you need. Cheers, Julien uThank you for your answer Julien. Yes, my idea was to add a second regexp like private val SingleCoordinate = \"\"\"([0-9]{1,2})/([0-9]{0,2})/([0-9]{0,2}(?:\.[0-9]{1,2})?)?/([NSEW])\"\"\".r Then the simplest option would be: 1) insert a new case entry leveraging it twice at before the basic latitude/longitude one, to handle the {{coordinate}} template once added to GeoCoordinateParserConfig. 2) add a match test at resorting to doubleParse.parse() However this is a bit against encapsulation rules. Having everything parsing related inside the parser class would be nicer but we cannot directly return a GeoCoordinate instance since we need to parse one coordinate at a time. If you want to implement one of the solutions, feel free to do so. Cheers, Nono On Fri, Jun 21, 2013 at 11:07 AM, Julien Cojan < >wrote:" "DBpedia HTTPS Support?" "uHey Everyone, Just wondering why dbpedia.org does not support HTTPS? Are any plans to host DBpedia in HTTPS in the near future? Cheers, Sam Smith Hey Everyone, Just wondering why dbpedia.org does not support HTTPS? Are any plans to host DBpedia in HTTPS in the near future? Cheers, Sam Smith u0€ *†H†÷  €0€1 0 + uI do find the idea of HTTPS as an option useful, beyond ACLs, from a privacy standpoint. Now that I think about it, I don't want everyone between me and DBpedia being able to find out exactly which resources I access, what sparql queries I write, what results I get etc. At this point we can be sure that all this is being logged by all kinds of agencies. Cheers, Alexandru On Apr 16, 2014 3:10 PM, \"Kingsley Idehen\" < > wrote: u0€ *†H†÷  €0€1 0 +" "a few questions about dbpedia extraction" "uHi, First, thank you for the great effort to make dbpedia available and improve it continuously. I also see some places that might need attention. Question1: I wonder why the the ontology property \"isPartOf\" is used to connect dbpedia:Ocean_City,_Maryland and dbpedia:Maryland but not the more informative property \"state\", which is actually shown in the wikipedia page. On the other hand, dbpedia:Thibodaux,_Louisiana does relate to dbpedia:Louisiana using \"state\". Would it be better to consistently use \"state\"? Question2: There is no the class \"State\" in the dbpedia owl ontology, which is very inconvenient. Morevoer, there are many places which are cities but do not have \"city\" type, for example, dbpedia:Beijing. The lowest dbpedia-owl class for \"Beijing\" is still \"Settlement\"! Thanks, Lushan Hi, First, thank you for the great effort to make dbpedia available and improve it continuously. I also see some places that might need attention. Question1: I wonder why the the ontology property 'isPartOf' is used to connect dbpedia:Ocean_City,_Maryland and dbpedia:Maryland but not the more informative property 'state', which is actually shown in the wikipedia page. 
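(The contrast is easy to check directly; a small query like the following sketch lists every property that links the town resource to its state resource:)

SELECT ?p WHERE {
  <http://dbpedia.org/resource/Ocean_City,_Maryland> ?p
      <http://dbpedia.org/resource/Maryland> .
}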
On the other hand, dbpedia:Thibodaux,_Louisiana does relate to dbpedia:Louisiana using 'state'. Would it be better to consistently use 'state'? Question2: There is no the class 'State' in the dbpedia owl ontology, which is very inconvenient. Morevoer, there are many places which are cities but do not have 'city' type, for example, dbpedia:Beijing. The lowest dbpedia-owl class for 'Beijing' is still 'Settlement'! Thanks, Lushan uOn 28 October 2011 23:09, Lushan Han < > wrote: Ocean_City,_Maryland uses Infobox_Settlement, which has no mapping for 'state', while Thibodaux,_Louisiana uses Geobox, which does. Infobox_Settlement (for that page) has: |subdivision_type1 = [[Political divisions of the United States|State]] |subdivision_name1 = [[Maryland]] which requires a set of conditional mappings, repeated for each of the numbers appended, while Geobox has: | state = Louisiana which is trivial to map. Again, that uses Infobox_Settlement: |settlement_type = [[Direct-controlled municipality of the People's Republic of China|Municipality]] There was no mapping for 'municipality', so I've added it. It's still not 'city', as you expected, but that's down to Wikipedia. Hope that helps. uOn 10/28/2011 6:09 PM, Lushan Han wrote: One of the reasons I like the DBpedia Ontology is that it ducks the difficult issues; this makes it more possible for DBpedia to be populated with large amounts of accurate instance data, at least in theory. A correct ontology of administrative divisions is a difficult problem because the laws that define administrative divisions are different in different countries and even inside smaller level administrative divisions. In a week or so I'm going to vote for elected officials in an administrative division called Caroline. Mail to me is addressed to something called Brooktondale which has no government but does have government buildings and a post office. Another village, Slaterville Springs, has no government but does have a special tax assessment. New york state has nearly 20 different types of local government (what Freebase would call a /location/citytown.) Every point in New York, including the great wilderness of the Adirondacks, is part of a municipality (4th level administrative subdivision), but there are places in New Mexico that are unincorporated and that only belong to a county (3rd level administrative subdivision.) Perhaps the U.S. is a more difficult case than most other countries because a :U.S._State has a high degree of autonomy. Medicaid, our program that provides health care for the indigent is administered by individual states, and supported by taxes that I pay to my County. In most U.S. States I can buy a bottle of wine at the supermarket, but in New York I have to go to a liquor store and not on Sunday, except for the one liquor store in town that (inexplicably) is open on Sunday. Don't forget that the U.S. also has non-:State components such as DC, Puerto Rico, and Guam. In other countries, a \"2nd level administrative division\" means something very different. Since there are another 200 or so countries \"1st level administrative divisions\" in the world (nobody is sure about the exact right number) the problem is way worse than that. In geonames, there are \"Nth level administrative divisions\" where N=14, and maybe if you let N go to 5 or 6 sometimes, you're doing about as well as can be done for the world. Now you might think :City isn't so bad but that's not the case. :Los_Angeles is a :City that is inside a :County. 
:New_York_City is really a :City (4th level) but is has five counties (3rd level) inside of it. :London and :Tokyo (which everyone expects to be on the top-10 :City list) aren't even cities, they are administrative subdivisions that contain cities And there's no clear distinction between a :Village, :Town, and :City, any more than there is between a :Hill and a :Mountain. Now, it's a fair complaint that the extraction system behind DBpedia is less than perfect, but I think the ontology behind it is correctly designed and detailed for the job it has to do. \"Vernacularizing\" it so that :Tokyo is a :City and \"specializing\" it so it knows everything about administrative divisions in Indonesia are interesting enrichment projects, but what it aims to do right now is about as useful as is practical." "java.util.NoSuchElementException: key not found: dct:subject" "uHello, I was running dbpedia extraction against zhwiki 20160203 data and I got key not found error. Below is the config file and error messages. Not sure if it is related recent change on the framework. Could anyone advise how to do? Thanks. zhwiki:  $ /run extraction extraction.properties INFO: Ontology loadedException in thread \"main\" java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at org.dbpedia.extraction.mappings.CompositeParseExtractor$$anonfun$2.apply(CompositeParseExtractor.scala:73) at org.dbpedia.extraction.mappings.CompositeParseExtractor$$anonfun$2.apply(CompositeParseExtractor.scala:73) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) at scala.collection.immutable.List.map(List.scala:284) at org.dbpedia.extraction.mappings.CompositeParseExtractor$.load(CompositeParseExtractor.scala:73) at org.dbpedia.extraction.dump.extract.ConfigLoader.org$dbpedia$extraction$dump$extract$ConfigLoader$$createExtractionJob(ConfigLoader.scala:122) at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:40) at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:40) at scala.collection.TraversableViewLike$Mapped$$anonfun$foreach$2.apply(TraversableViewLike.scala:169) at scala.collection.Iterator$class.foreach(Iterator.scala:743) at scala.collection.immutable.RedBlackTree$TreeIterator.foreach(RedBlackTree.scala:468) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.IterableLike$$anon$1.foreach(IterableLike.scala:310) at scala.collection.TraversableViewLike$Mapped$class.foreach(TraversableViewLike.scala:168) at scala.collection.IterableViewLike$$anon$3.foreach(IterableViewLike.scala:113) at org.dbpedia.extraction.dump.extract.Extraction$.main(Extraction.scala:30) at org.dbpedia.extraction.dump.extract.Extraction.main(Extraction.scala)Caused by: java.util.NoSuchElementException: key not found: dct:subject at scala.collection.MapLike$class.default(MapLike.scala:228) at scala.collection.AbstractMap.default(Map.scala:59) at 
scala.collection.MapLike$class.apply(MapLike.scala:141) at scala.collection.AbstractMap.apply(Map.scala:59) at org.dbpedia.extraction.mappings.ArticleCategoriesExtractor. (ArticleCategoriesExtractor.scala:17) 24 more Thanks,Melissa Hello, I was running dbpedia extraction against zhwiki 20160203 data and I got key not found error. Below is the config file and error messages. Not sure if it is related recent change on the framework. Could anyone advise how to do? Thanks. zhwiki: Melissa" "Current state of the SPARQL endpoint" "uHello, as pointed out earlier, I'm having some issues with the new SPARQL endpoint. I'm currently using DBpedia to generate dictionaries for a task in an information extraction class I'm taking. For this task, I need a list of entities, e.g. actors. Consider the following query: SELECT ?name WHERE { ?a . ?a ?name } With DBpedia 3.2, this would work just fine. With the current release, the query will time out after a while, giving me a partial result list. This actually is a feature called \"Anytime\" queries. [0] I wonder if enabling the Anytime feature is a good idea - not because I can't get my list of actors, but because it's broken, undocumented and proprietary: Kingsley Idehen wrote: That's not exactly in the \"SPARQL Protocol for RDF\" recommendation. There is no way right now to let a SPARQL-compliant client know there are more results. AFAIK, it is also impossible to set these timeouts using the SPARQL Protocol. I don't think proprietary protocol extensions are the right thing for an Open project. Additionally, handing out different result sets for the same query depending on what kind of data is cached and how far subordinate clauses from *previous* queries have been evaluated (see [0]) sounds broken. In fact, I don't believe the SPARQL W3C recommendation allows that (section 12.5, \"Evaluation Semantics\"). I do acknowledge that handling web-scale data sets presents a problem, but I'd rather see a query language which can do proper chunking of results instead of breaking SPARQL. Anyways - I tried to work around this issue by using the LIMIT and OFFSET solution sequence modifiers. The W3C recommendation states: \"Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.\" - so throw in an ORDER BY as well. This will break after some iterations: 22023 Error SR353: Sorted TOP clause specifies more then 10100 rows to sort. Only 10000 are allowed. Either decrease the offset and/or row count or use a scrollable cursor SPARQL query: define sql:signal-void-variables 1 define input:default-graph-uri SELECT ?name WHERE { ?a . ?a ?name } ORDER BY ?name LIMIT 100 OFFSET 10000 Any ideas? Regards, Michael [0] ?id=1494 uMichael Haas wrote: Micheal, The only issue here is that it isn't documented properly. And the real question is simply this: how do you answer queries at Web Scale? Do you seriously expect a SPARQL endpoint to deliver complete results for every conceivable query (good, bad, complex) in predictable time, really? If you prefer the old behavior, no problem, we switch off this feature, but don't jump to conclusions re. its purpose. btw - I could simply have opted for the basic single server edition of Virtuoso, and in retrospect, maybe that's what should be out there. Do remember, we are offering a service at our cost to the community, please do remember this. Kingsley uKingsley Idehen wrote: Thanks for the follow-up. However, It's still not working. 
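(A compact way to quantify this kind of discrepancy, instead of paging through the full result set, is an aggregate query. COUNT is a SPARQL 1.1 / Virtuoso feature relative to the SPARQL 1.0 spec debated in this thread, and dbo:Actor below merely stands in for the class URI that the archive stripped from the original query; a sketch, not the exact query used here.)

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>
SELECT (COUNT(?name) AS ?n) WHERE {
  ?a a dbo:Actor ;
     rdfs:label ?name .
  # restrict to one language, since every resource carries several rdfs:label values
  FILTER ( lang(?name) = "en" )
}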
With DBpedia 3.2, I got ~25000 results: :~/uni/IE/group3$ wc -l oldseeds/actors.txt 26932 oldseeds/actors.txt Now, it's more like ~150: :~/uni/IE/group3$ wc -l seeds/actors.txt 132 seeds/actors.txt (Please note these numbers are after post-processing, ie removing duplicates etc). This is the exact same source code and queries I'm using. Either the amount of actors in Wikipedia reduced dramatically, the extractors are broken or something is not right with the SPARQL endpoint. Thanks for the update - you previously suggested it was the \"Anytime Query\", that's why I was looking into that. Regards, Michael uWell, at least I can say it's neither the amount of actors in Wikipedia nor the dbpedia extractors. grep -c 30912 Best, Georgi uMichael, Please retry: SELECT ?name WHERE { ?a . ?a ?name } The issue has nothing to do with \"Anytime Query\" feature and everything to do with misconfiguration of the Virtuoso cluster instance. uKingsley Idehen wrote: The real issue is that it is not SPARQL. Some things you're doing are not in the SPARQL W3C recommendation. Handing out partial results to SPARQL results is what I would consider broken behaviour, especially as there is no way to tell the SPARQL client that there more results available. SPASQL does not count, and HTTP response (codes) are not specified in the W3C recommendation (other than 200, 400 and 500 - as far as response codes are concerned, of course). As I said, I acknowledge this problem. It is not easy to solve and I do not have a good solution handy. There is no need to be offended. I'd prefer a technical discussion - I'm sorry if I have insulted you. From your follow-up: > BTW - set the timeout to \"0\" and the \"Anytime Query\" feature is disabled re. the endpoint. Can you please point me to the part of the SPARQL Protocol recommendation [0] where it talks about setting timeouts or somesuch? It might be a good idea to disable the anytime query feature for those using the SPARQL protocol and do some kind of opt-in for those using clients who can speak your extensions. I'd rather have a HTTP 500 when my queries fail instead of getting possibly wrong data without being told there's something missing. Of course, I could all be wrong and what you're doing is standards compliant - feel free to prove me wrong :) [0] uMichael Haas wrote: Embracing and coherently extending the SPARQL standard isn't a crime as long as you don't break the core standard. Do you seriously think we should wait for consensus re. how and when SPARQL matches and exceeds SQL (which is darn important)? btw - I am sure you've seen: and rewind to a query language without aggregates and update capabilities as part of a message that reads: Web as a structured database that facilitates powerful analytics etc Kingsley uGeorgi Kobilarov wrote: uKingsley Idehen wrote: How is 40000 records out of almost 100000 a good result? Especially when obtaining large result sets was no problem two weeks ago? > (amply generous since you are clearly > crawling this data). Is that not a valid use case? Queries similar to \"Give me all actors\" are even mentioned on dbpedia.org [0]. In any case, extracting data for (computational) linguistic purposes sounds like a good use case for an ontology built from Wikipedia. Also, accusing me of \"crawling\" this data is a bit ironic considering how future DBpedia versions will be built. Regards, Michael [0] OnlineAccess#h28-4 uKingsley Idehen wrote: How is 40000 records out of almost 100000 a good result? 
Especially when obtaining large result sets was no problem two weeks ago? > (amply generous since you are clearly > crawling this data). Is that not a valid use case? Queries similar to \"Give me all actors\" are even mentioned on dbpedia.org [0]. In any case, extracting data for (computational) linguistic purposes sounds like a good use case for an ontology built from Wikipedia. Also, accusing me of \"crawling\" this data is a bit ironic considering how future DBpedia versions will be built. Regards, Michael [0] OnlineAccess#h28-4 uDbpedia resources have multiple rdfs:label, one for every language. You need to use a filter to only get @en triples. Georgi uKingsley Idehen wrote: I just wanted to echo what Kingsley said here and add one caveat of my own. A *key* part of driving a healthy standards process is extension & innovation performed by implementations. If implementations blindly adhered to the standard, there would be no experience to draw on to motivate and coalesce around designs for future specifications. The sort of innovation done by Virtuoso and many other SPARQL engines is key to a healthy future for SPARQL. The one caveat is that I do agree that, in general, extensions really ought to be compatible with the core standard. This means that if a query is asked or a request made via the SPARQL protocol, it's almost always best for the implementation to behave as per the spec. This allows generic SPARQL clients to safely and consistently access a service, while other clients with knowledge of your extensions can still take full advantage of them. Lee uLee Feigenbaum wrote: Lee, To clarify I meant: extend cleanly i.e, above the core. Thus, people that want to work with the core can do so without hitting the basic vs extension quandary due to poor partitioning of a given extension and the core :-) Exactly! uLee Feigenbaum wrote: That is my opinion as well - and it basically echos what I just wrote to Kingsley in an private email. Regards, Michael uKingsley Idehen wrote: Then can you please explain how your normal run-off-the-mill SPARQL client implementing the SPARQL protocol can turn off \"Anytime Queries\"? It's not like I haven't asked this before, maybe I'll get an answer now :) Regards, Michael uHi Michael, If SPARQL client do not specify 'timeout' parameter, then no \"Anytime Queries\" will be in effect. If there is noticeable difference between 3.2 and 3.3 SPARQL endpoints results it is not \"Anytime Queries\" unless 'timeout' option is used, imho it is something to be debugged. Best Regards, Mitko On Jul 4, 2009, at 1:27 AM, Michael Haas wrote: uHi Michael, The 40k limit applies only to sorted/order by queries e.g. The limit is same as for 3.2 version of DBpedia. Best Regards, Mitko On Jul 4, 2009, at 12:54 AM, Michael Haas wrote: uMichael Haas wrote: I didn't say or imply you get 40K out of a 100K. I said and implied, that's the size of your data window. You continue to miss my points, profoundly. Please step back, and then re-read this entire thread, and then you will more than likely discover the source of my angst re. your comments. Kingsley uMichael Haas wrote: If you recall, in the prior release of DBpedia we would return timeouts for certain queries. Do you recall that behaviour? \"Anytime Query\" is about a new dimension that allows you to set timeouts and retries so that you can do more. 
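(Concretely, the knob in question is the non-standard timeout parameter of the /sparql endpoint — the same field that appears on the HTML query form, where values below 1000 milliseconds are ignored, and leaving it out keeps the plain all-or-nothing behaviour, as Mitko confirms further down the thread. A request sketch with an illustrative query and value; in practice the query string is URL-encoded:)

GET http://dbpedia.org/sparql
      ?query=SELECT ?s WHERE { ?s a <http://dbpedia.org/ontology/Actor> } LIMIT 1000
      &timeout=8000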
As with most things, in a few years time when \"Anytime Query\" is implemented elsewhere by other Databases that offer query parallelization, you will at least have a URI back to this thread :-) Also remember, I told you that the initial problem had zilch to do with the \"Antytime Query\" feature, it had everything to do with an error in the configuration of one of the cluster nodes. FWIW - cluster nodes are about horizontal partitioning and query parallelization, so if a node is mis-configured it can distort the behavior the whole. Kingsley uMitko Iliev wrote: Thanks. I was under the impression Anytime Queries were enabled unless you passed a specific parameter to the SPARQL endpoint, which is AFAIK not possible using the SPARQL protocol. You are the first person to clarify that. Regards, Michael uKingsley Idehen wrote: My entire point was that enabling Anytime Queries by default is a bad thing, especially when there is no way to turn off this behaviour using the SPARQL protocol. It turns out that it is disabled unless you specify the timeout parameter - if you had told me this, we'd both have saved a lot of time. In fact, you told me I needed to set specific parameters to disable it. It goes without saying that you also missed my points. I acknowledged several times that the Anytime Query feature has valid use cases, I just said it mustn't be enabled by default as it is incomptaible with the SPARQL protocol. In any case, my concerns are invalid. Thanks. Regards, Michael uKingsley Idehen wrote: As it turns out, the answer to this question is \"it doesn't need to\". There's nothing wrong with innovation. You seem to imply I am wrong - I am not. You simply missed my point :) I realize that. I was solely reacting to the fact of Anytime Queries becoming the default - which was not true. Sorry 'bout the drama. Regards, Michael uMichael Haas wrote: Micheal, Just read the text of: finput field is). You could have just clicked \"Run Query\" and seen \"&timeout;\" at the end. You could have also done a backspace and hit \"enter\" to see it is a clean extension. Instead, you decided to offer a lecture of SPARQL compliance etc. Kingsley uMichael Haas wrote: I am sure I missed some of your more salient points, but the treatise on SPARQL conformance was quite distracting, especially bearing in mind how much time and effort actually goes into making this instance available to the world. We started with a simple issue that became a two-way riddle due to a cluster node mis-configuration. Thus, in my eyes, I was naturally looking beyond the basics; assuming you had seen the text in the /sparql page and its default setting, since you were talking about large results and what appeared to be missing data etc. Great so were all set :-) uKingsley Idehen wrote: \"Execution timeout, in milliseconds, values less than 1000 are ignored\"? I did read that, but I assumed that entering a value less than 1000 would just use the default timeout of the Anytime Query feature instead of disabling it. It has only become apparent (to me, at least) that it *is* a clean extension (ie opt-in) after Mitko joined the conversation. Let me sum this up: * I'd consider it a bad thing (\"broken\" from my first posting in this thread) if Anytime Queries were enabled by default. They are not enabled by default. Everything is good. * Anytime queries are good, if supplied with a good protocol - which no doubt you're going to do. 
I'm looking forward to seeing some documentation on this feature - I've been working with other triple stores which shall remain nameless which would benefit from such a thing. * Weekends are good. I hope everyone is going to have a great one :) Regards, Michael uKingsley Idehen wrote: I did insist on that point because nobody dispelled my worries (till Mitko came along). ;) especially bearing in mind how And it is appreciated. I used to use the /sparql website a lot for testing queries - these days, I am mainly using the ARQ libs which don't do the Anytime Query stuff. That probably explains why I assumed it was enabled by default - because it was the first thing you suggested when I talked about missing result sets. Or it was the fact there are two things to SPARQL: the SPARQL protocol and the SPARQL query language. Yep! Have a nice weekend. Regards, Michael uMichael Haas wrote: Yes, additional participants typically fix broken conversations. That's why conversations are so important, even more so now that we are morphing the Web into a distributed and structured discussion space :-) Great :-) Kingsley" "Deploying dbpedia on other wikis" "uHi, just out curiosity: is it possible to apply dbpedia's knowledge extraction process to other mediawiki-based wikis, and if so, how much work would that entail? Cheers," "live dbpedia dumps" "uHi, I tried to get a recent dump of live dbpedia. But the last update is over two months ago. When will you generate a new dump ? Can you generate it regularly (say monthly) ? Thanks. dbpedia_2013_02_05.nt.bz2 06-Feb-2013 16:51 3.5G dbpedia_2013_03_04.nt.bz2 05-Mar-2013 01:09 3.5G uHi Xiliang, On 05/15/2013 07:41 PM, Xiliang Zhong wrote: Normally, we generate a dump of the DBpedia-Live data on a monthly basis. However, we didn't generate a dump in April, as our DBpedia-Live server had a serious hardware problem. But we will generate a dump for May very soon. Sorry for any inconvenience. uHi again, The DBpedia Live dump of May is now available. On 05/15/2013 07:57 PM, Mohamed Morsey wrote: uThank you! On Fri, May 17, 2013 at 10:50 PM, Mohamed Morsey < > wrote:" "SPARQL endpoint / Full Text Search / Problems with International Characters" "uHi all, I'm trying to integrate data from DBpedia into the German language Database of University Museums And Collections in Germany[1] (hosted and maintained at Humboldt-University Berlin/Hermann von Helmholtz- Zentrum), especially information about persons. Part of this integration is a kind of \"Discovery Service\", which tries to find relevant Wikipedia entries (URIs) for person records we have in our database of persons which are connected with academic collections[2]. For that, I try to match our person names against labels in DBpedia, using the SPARQL endpoint at Text Search option via bif:contains. Problem is, it only works, if there are no special characters like \"é\" or \"ä\" in query strings: 37000 Error SQ074: Line 6: syntax error at ',' before ''UTF-8'' Could maybe look somebody into this - is there something wrong on my end, or is it a bug? Following are 2 examples, first not working, then one working (before URL-Encoding everything is UTF-8). Thanks a lot for your help! Martin Stricker, Berlin Example not working: SELECT ?uri ?txt WHERE { ?uri rdfs:label ?txt . ?txt bif:contains \"'Reinhard' AND 'Kekulé'\" . 
FILTER(LANG(?txt) = \"de\")} Error message: 37000 Error SQ074: Line 6: syntax error at ',' before ''UTF-8'' SPARQL query: define sql:signal-void-variables 1 define input:default-graph-uri PREFIX : PREFIX dbp: SELECT ?uri ?txt WHERE { ?uri rdfs:label ?txt . ?txt bif:contains \"'Reinhard' AND 'Kekulé'\" . FILTER(LANG(?txt) = \"de\")} Example working: SELECT ?uri ?txt WHERE { ?uri rdfs:label ?txt . ?txt bif:contains \"'Ludwig' AND 'Darmstaedter'\" . FILTER(LANG(?txt) = \"de\")} [1] [2] personen" "Error: HttpException: 500 SPARQL Request Failed" "uHi all I'm trying to run queries to dbpedia through sparql using my application, but sometimes I get the following exception: Error: HttpException: 500 SPARQL Request Failed what is it? how can i handle it? Thanks" "How to get DBpedia updated dump for different languages" "uHi All, Greeting for the day!! I am working on some project for which I want DBpedia updated dumps for certain languages.But I could not get it except for English from DBPedia-Live ( there at ( I stuck and looking guidance from you people which will be very helpful to me. I have below queries. *\"How can we get updated dump of Ontology Info-box data and detailed description for different languages from DBpedia?\"* Please guide for the same. Thanks for giving ear to my words. uHi Gaurav, On 02/18/2013 11:37 AM, gaurav pant wrote: DBpedia-Live currently supports the English language only, so you can download dumps for English only. The full DBpedia releases are published approx twice a year. uHi again, On 02/18/2013 11:37 AM, gaurav pant wrote: You can also download the latest Wikipedia dumps as described here [1], and download the DBpedia framework [2] and run it on the dumps. uHi Gaurav, You can also use this guide to extract them yourself Best, Dimitris On Mon, Feb 18, 2013 at 1:06 PM, Mohamed Morsey < > wrote: uHi , Thanks Mohamad & Dimitris for your great help I have some more queries- 1- Is there someway so that I can get just updated records or dumps rather than whole dump for a particular language ( except English) ? 2- Can you suggest which of these dumps should I use for getting a sort description of somethings(max 500 characters) and some specificity like some accomplishment,Date of Birth,Birth place, Occupation if he is a person or so on? Again big thanks for all the support. On Mon, Feb 18, 2013 at 4:49 PM, Dimitris Kontokostas < >wrote: uHi, International Chapters may have more recent dumps. I added a column \"last extraction\" in Julien uHi, On 02/19/2013 10:51 AM, Julien Cojan wrote: uHi Morsey/Julien/All, Thanks for all the support I am * Instructions there are as follows. \" $hg clone $hg update dump $mvn clean install $cd dump $/run download config=download.properties.file $/run extraction extraction.properties.file \" when I am doing *\"mvn clean install\"* I am getting below error below errors. 
*with -e option(stack trace) >* \" [ERROR] Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:compile (process-resources) on project core: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 137(Exit value: 137) -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:compile (process-resources) on project core: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 137(Exit value: 137) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:217) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59) at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183) at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156) at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537) at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196) at org.apache.maven.cli.MavenCli.main(MavenCli.java:141) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290) at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230) at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409) at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352) Caused by: org.apache.maven.plugin.MojoExecutionException: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 137(Exit value: 137) at org_scala_tools_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:350) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209) 19 more Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 137(Exit value: 137) at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:346) at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:149) at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:136) at org_scala_tools_maven_executions.JavaMainCallerByFork.run(JavaMainCallerByFork.java:80) at org_scala_tools_maven.ScalaCompilerSupport.compile(ScalaCompilerSupport.java:124) at org_scala_tools_maven.ScalaCompilerSupport.doExecute(ScalaCompilerSupport.java:80) at org_scala_tools_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:342) 21 more [ERROR] [ERROR] Re-run Maven using the -X switch to enable full debug logging. \" and *with -X option >* Almost same error message as above. I am using *Ubuntu 12.04.1 LTS* & *kernal 3.2.0-36-virtual*. Please help me out to resolve this installation problem. Thanks for your precious advices. 
On Tue, Feb 19, 2013 at 6:06 PM, Mohamed Morsey < > wrote: uHi All, Apart from below I have checked my pom.xml file where I come across below lines. \" uCan you send us your download.properties.file && extraction.properties.file? Cheers, Dimitris On Wed, Feb 20, 2013 at 10:27 AM, gaurav pant < > wrote: uBZh91AY&SY;ؕ…° uin the extraction instructions when we say download.properties.file && extraction.properties.file we mean one of the existing download* & extraction* files or a new one you create On Wed, Feb 20, 2013 at 12:42 PM, gaurav pant < > wrote: uThanks for your quick response I have not in that step as I am getting error prior to that.Hence I have not prepared any download* or extraction* file till yet. I am following this documentation. * Here steps defined are as below. \" $hg clone $hg update dump $*mvn clean install* * uHi Gaurav, On 02/20/2013 09:27 AM, gaurav pant wrote: I believe the scala plugin is not installed on your machine. So, please try to install it either via the Synaptic-Package-Manger or with command apt-get install, and try to compile the project. uAre you using Maven2 or Maven3? we use maven3 also remove your .m2 folder and try again On Wed, Feb 20, 2013 at 1:13 PM, gaurav pant < > wrote: uHi, @Morsey- I could not get command to install *maven-scala-plugin. *Don't know from where I can download it even. I have googled the same. I come across But I don't know where should I put this. Also at man page of mvn i am not getting any help to install plugin. @Dimitris uIt is strange because we use the exact same version. Maybe it is a temporary maven repository issue or something in your system configuration Can you checkout the code from github? git clone git://github.com/dbpedia/extraction-framework.git delete your .m2 folder rm -rf ~/.m2 and try again? mvn clean install On Wed, Feb 20, 2013 at 2:03 PM, gaurav pant < > wrote: uHi, On 02/20/2013 01:03 PM, gaurav pant wrote: try this one [1]. uHi, On 02/20/2013 02:58 PM, gaurav pant wrote: This webpage describes how to configure the maven-scala-plugin [1]. Please follow the instructions, add the required parts to your pom file, and give it another try. [1] usage.html uHi Morsey/All, I have tried the same but no luck.Can you please provide me your pom.xml file? Also in my pom.xml file it is already explicitly mentioned that * uHi Gaurav, On 02/21/2013 07:03 AM, gaurav pant wrote: I'm using the pom file of the project as is, and it's compiling with no problem. Could you please send us your pom file after changing it according to what is stated here [1]?. uHi Morsey/Dimitris, I have tried a fresh installation on other machine and it finally successful. I am following the same steps but don't know why could not installed in other machine. Anyway thanks for being too patient to me I want one more help from you guys that I do not want to download using this module because i have per-existing dataset which are being downloaded.I want to extract information from \"eswiki-20130208-pages-articles.xml.bz2\" page. Can I directly do this? If there is any start up tutorial than please let me know. Also let me know if it is possible to get information into rdf format from these pages , than it will be of great help. Thanks again:) On Thu, Feb 21, 2013 at 2:03 PM, Mohamed Morsey < > wrote: uHi Gaurav, On 02/21/2013 03:05 PM, gaurav pant wrote: You can use the guide Dimitris suggested before [1], and go directly to the last step, as you have the dumps already. 
One more thing, the file you mentioned is the dump of the Spanish Wikipedia, so you can configure the framework to extract only from a specific language(s) by adapting file [2]. [1] [2] extraction.default.properties uHi Morsey/Dimitris/All, Can you suggest Perl based API for extracting information for these wiki media dumps as that extractor is still not working properly. My requirement is to get some basic information about the search term and a sort abstract about title. I am fine with writing a script to parse English language page but for other language i am facing lot of problem using Perl. Thanx . On Fri, Feb 22, 2013 at 1:38 AM, Mohamed Morsey < > wrote: uHi Gaurav, I recently created a dump for Dutch [1] & Greek [2] so the extractors are working fine and this is probably a configuration issue too. I am not aware of any perl based script, maybe you should ask a Wikipedia or a perl-related mailing list for that Best, Dimitris [1] [2] On Sat, Feb 23, 2013 at 3:40 PM, gaurav pant < > wrote: uHi Morsey/All, I come across this blog where you have placed your comment that problem solved. But how you have fixed the problem is not mentioned. I am facing the same problem during build. Can you please let me know how you have resolved this? Thanks On Sun, Feb 24, 2013 at 12:36 PM, Dimitris Kontokostas < >wrote: uHi Gaurav, On 02/25/2013 06:21 AM, gaurav pant wrote: That was simply a dependency issue, as the live module was depending on \"openanzo\", which was missing. But, the live module does not depend on it anymore. uHi, My errors list is as follows.It seems I am also using \"openanzo\". \" *openanzo[*INFO] uHi Gaurav, AFAICS the \"openanzo\" dependency is totally removed from the whole project, but it seems that you want to use it as you are using \"Sesame\" as the underlying triplestore. On 02/25/2013 08:15 AM, gaurav pant wrote:" "Confusion about DBpedia international datasets and chapters" "uHi everyone, I'm looking to formally describe the structure of the DBpedia datasets and I'm a bit confused about the difference between the main DBpedia download folder ( language dumps available there (for example German: datasets and downloads on the other hand (for example German: If I download a dump from here: open it, choose an arbitrary resource and try to open its URI, the data in the dump and in the LOD of the chapter will not be the same, right? Because in the \"main\" DBpedia download directory is different data than in the chapters download directory (and in turn its SPARQL endpoint and LOD). How does this make sense? There is a DBpedia German 3.9 dataset and a different DBpedia German chapter dataset. I expect the same is true for the other chapters? Why is the chapter's data (at least the dumps) not aggregated at the main DBpedia? I think it is very confusing and hard to understand. It is also impractical. If I want the latest DBpedia for all languages, I have to go to every single chapter, find their download pages (which don't even have the same URI pattern for the different languages), check if their data is more recent than the \"main\" Dbpedia's data, then download it or get the data from the \"main\" download page. Can a procedure be introduced to somehow alleviate this issue? I think conversions of all Chapters. confused regards, Martin uHi Martin, The reason the downloads differ is historical. 
On the one hand, some of the country chapters were established before the main chapter started extracting anything except english and on the other, the country chapters \"should\" produce dumps more frequently. I agree with you that the country chapter dumps should be mirrored or linked on the main downloads page. We should discuss this issue at the next dbpedia developers telco on Wednesday 02.07.2014 . Cheers, Alexandru On Jun 19, 2014 4:26 PM, \"Martin Brümmer\" < > wrote:" "Dbpedia query inconsistency?" "uDear all, Property functional property? Any instance is the subject of a triple ?instance < one year value? Besides, when I execute next query: SELECT DISTINCT ?prop_theEntity_1 ?data_property ?data_value WHERE { ?prop_theEntity_1 a . FILTER ( ?prop_theEntity_1 = ) OPTIONAL { ?prop_theEntity_1 ?data_property ?data_value. } } ORDER BY ?prop_theEntity_1 ?data_property ?data_value I got two rows describing property productionEndYear http://www.w3.org/2001/XMLSchema#gYear> http://dbpedia.org/page/Porsche_997 defines 2011 and 2011. However if I rewrite query variable names I got only one value for productionEndYear. SELECT DISTINCT ?e ?p ?v WHERE { ?e a . FILTER ( ?e = ) OPTIONAL { ?e ?p ?v. } } ORDER BY ?e ?p ?v http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query;=SELECT+DISTINCT+%3Fe+%3Fp+%3Fv%0D%0AWHERE+%7B%0D%0A%3Fe+a+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FAutomobile%3E+.%0D%0AFILTER+%28+%3Fe+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FPorsche_997%3E+%29%0D%0AOPTIONAL+%7B+%3Fe+%3Fp+%3Fv.+%7D%0D%0A%7D+ORDER+BY+%3Fe+%3Fp+%3Fv&format;=text%2Fhtml&timeout;=0&debug;=on http://dbpedia.org/resource/Porsche_997 http://dbpedia.org/ontology/productionEndYear\"2011-01-01T00:00:00+02:00\"^^< http://www.w3.org/2001/XMLSchema#gYear> Am I missing something? Regards, Martin Dear all, Property http://dbpedia.org/ontology/productionEndYear shouldn't be a functional property? Any instance is the subject of a triple ?instance < http://dbpedia.org/ontology/productionEndYear > ?year shouldn't have only one year value? Besides, when I execute next query: SELECT DISTINCT ?prop_theEntity_1 ?data_property ?data_value WHERE { ?prop_theEntity_1 a < http://dbpedia.org/ontology/Automobile > . FILTER ( ?prop_theEntity_1 = < http://dbpedia.org/resource/Porsche_997 > ) OPTIONAL { ?prop_theEntity_1 ?data_property ?data_value. } } ORDER BY ?prop_theEntity_1 ?data_property ?data_value http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+DISTINCT+%3Fprop_theEntity_1+%3Fdata_property+%3Fdata_value%0D%0AWHERE+%7B%0D%0A%3Fprop_theEntity_1+a+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FAutomobile%3E+.%0D%0AFILTER+%28+%3Fprop_theEntity_1+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FPorsche_997%3E+%29%0D%0AOPTIONAL+%7B+%3Fprop_theEntity_1+%3Fdata_property+%3Fdata_value.+%7D%0D%0A%7D+ORDER+BY+%3Fprop_theEntity_1+%3Fdata_property+%3Fdata_value&format=text%2Fhtml&timeout=0&debug=on I got two rows describing property productionEndYear http://dbpedia.org/resource/Porsche_997 http://dbpedia.org/ontology/productionEndYear '2011-01-01T00:00:00+02:00'^^< http://www.w3.org/2001/XMLSchema#gYear > http://dbpedia.org/resource/Porsche_997 http://dbpedia.org/ontology/productionEndYear '2012-01-01T00:00:00+01:00'^^< http://www.w3.org/2001/XMLSchema#gYear > http://dbpedia.org/page/Porsche_997 defines 2011 and 2011. However if I rewrite query variable names I got only one value for productionEndYear. SELECT DISTINCT ?e ?p ?v WHERE { ?e a < http://dbpedia.org/ontology/Automobile > . 
FILTER ( ?e = < http://dbpedia.org/resource/Porsche_997 > ) OPTIONAL { ?e ?p ?v. } } ORDER BY ?e ?p ?v http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+DISTINCT+%3Fe+%3Fp+%3Fv%0D%0AWHERE+%7B%0D%0A%3Fe+a+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FAutomobile%3E+.%0D%0AFILTER+%28+%3Fe+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FPorsche_997%3E+%29%0D%0AOPTIONAL+%7B+%3Fe+%3Fp+%3Fv.+%7D%0D%0A%7D+ORDER+BY+%3Fe+%3Fp+%3Fv&format=text%2Fhtml&timeout=0&debug=on http://dbpedia.org/resource/Porsche_997 http://dbpedia.org/ontology/productionEndYear '2011-01-01T00:00:00+02:00'^^< http://www.w3.org/2001/XMLSchema#gYear > Am I missing something? Regards, Martin" "Galician resources are not available at the wikidata endpoint?" "uHi, Could someone tell me why can't I find Galician resources at SELECT DISTINCT * WHERE { ?wikidata owl:sameAs . OPTIONAL { ?wikidata rdfs:label ?cat . FILTER (lang(?cat) = 'ca') . } OPTIONAL { ?wikidata rdfs:label ?eus . FILTER (lang(?eus) = 'eu') . } OPTIONAL { ?wikidata rdfs:label ?glg . FILTER (lang(?glg) = 'gl') . } OPTIONAL { ?wikidata rdfs:label ?spa . FILTER (lang(?spa) = 'es') . } } LIMIT 100 shows empty data for Galician But a query as SELECT DISTINCT * FROM WHERE { ?wikidata owl:sameAs . } LIMIT 100 works fine at extracted from wikidata dump, but datasets are also avalilable at Regards uHi Miguel, we do not load all language editions with every file into the official DBpedia endpoint. The complete list of files loaded is DBpedia core (e.g. for 2016-04: As you can see there, we only labels for 12 languages: labels_en.ttl.bz2 24-May-2016 07:01 174Mlabels_en_uris_ar.ttl.bz2 07-Jul-2016 10:37 3Mlabels_en_uris_de.ttl.bz2 07-Jul-2016 14:25 13Mlabels_en_uris_es.ttl.bz2 07-Jul-2016 13:28 11Mlabels_en_uris_fr.ttl.bz2 07-Jul-2016 14:19 15Mlabels_en_uris_it.ttl.bz2 07-Jul-2016 12:59 11Mlabels_en_uris_ja.ttl.bz2 07-Jul-2016 12:11 6Mlabels_en_uris_nl.ttl.bz2 07-Jul-2016 12:07 9Mlabels_en_uris_pl.ttl.bz2 07-Jul-2016 12:58 9Mlabels_en_uris_pt.ttl.bz2 07-Jul-2016 12:57 8Mlabels_en_uris_ru.ttl.bz2 07-Jul-2016 13:16 10Mlabels_en_uris_zh.ttl.bz2 07-Jul-2016 10:53 6M This is to save some resources. Maybe we will extend the coverage of labels and abstracts a bit, but this is something Open Link has to green light, since they are hosing the endpoint. As a solution for your use case: If the answer your query, try combining both endpoints in your query with a federated query . Or set up yout own endpoint: Since you are comparing all resources with the equivalent from Wikidata (so it seems), you would get even better results if you set up your own DBpedia endpoint loading the labels_wkd_uris_xy_ttl.bz2 as opposed to the en_urisfiles we loaded in the official endpoint. Those are the labels of a language normalized to existing Wikidata resources, which are usually more numerous than those normalized to English resources. For the query above the following list of datasets loaded would suffice: /xy/labels_wkd_uris_xy.ttl.bz2 /xy/interlanguage_links_xy.ttl.bz2 /wikidata/interlanguage_links_xy_wikidata.ttl.bz2 Where xy would be replaces by the language(s) you are interested in. Setting up an endpoint is quiet easy now with Docker: Have a nice weekend, Markus Freudenberg Release Manager, DBpedia On Thu, Nov 17, 2016 at 6:55 PM, Miguel Solla < > wrote: uHi Markus, Thank you very much for your answer. I don't know if we are talking about the same endpoint 'we do not load all language editions with every file into the official DBpedia endpoint.' 
I was testing endpoint is Is it the same? In addition, if you've seen the results of the query I sent, they contain labels in Catalan and Basque languages. Regards On 19/11/16 11:01, Markus Freudenberg wrote: uHi Miguel, the two endpoints are not the same. The Wikidata endpoint integrates part of Wikidata to the DBpedia ontology. To know more about this see \"Wikidata through the eyes of DBpedia\" submitted in semantic Web journal. In that endpoint we load textual data from all languages in the mappings wiki Cheers, Dimitris Typed by thumb. Please forgive brevity, errors. On Nov 23, 2016 15:50, \"Miguel Solla\" < > wrote:" "Dewey and other subject classification codes - can these be extracted?" "uVia 'Just noticed Wikipedia has Dewey and LC and OCLC numbers for some books, eg Up In the Air In the source, these show up as {{Infobox Book pages = 303 pp | isbn = 978-0385497107 | dewey= 813/.54 21 | congress= PS3561.I746 U6 2001 | oclc= 46472260 However they're not in What is the procedure for getting this stuff exposed? Something to do with file elsewhere? Are UDC subject codes in there too somewhere? I would love all such bookish classification data to be exposed in dbpedia (and to know how others can help). There is a big library community out there who might be interested, if only they knew how cheers, Dan uOn Mon, Feb 8, 2010 at 9:40 AM, Chris Bizer < > wrote: Excellent, I look forward to it! :) Also wanted to ask about possible changes around Personbut maybe I should wait for that too cheers, Dan uHi Dan, We are still busy with getting all the infobox-to-DBpedia-ontology mappings into a Wiki so that the community can edit and complement them. I hope that this will finished in the next 3 weeks and we will be able to invite the community to add mappings with the next DBpedia release. We will send you a special invite then for checking that we have done everything right with the Infobox Book :-) Cheers, Chris" "(Website) How to add Datasets in DBpedia+ Website improvement" "uHi Diego, the Dataset and Download page is difficult. Currently and more so in the future, we will accept outside datasets and contributions. So the datasets page needs to llook like the like to see and entry for each Current datasets are: DBpedia Core DBpedia Commons DBpedia Wikidata DBpedia Wiktionary (maybe via DBnary.org) some more for sure The Download page would more a description on how to access the data. Statistics might go into All the best, Sebastian On 29.04.2015 09:03, Diego Moussallem wrote:" "Text Categorization with dbpedia" "uHello folks, I want to categorize words according to their subject. For example, if the input is: Porcupine Tree or Pink Floyd I want MUSIC as the category. I was going through the structure of the dbpedia for these pages @ But the category is not too generic. Is there any other way i can get generic categories from the dbpedia tables like 'Music', 'Sports' or 'Technology', etc. If not, what could be the other alternative? Hello folks, I want to categorize words according to their subject. For example, if the input is: Porcupine Tree or Pink Floyd I want MUSIC as the category. I was going through the structure of the dbpedia for these pages @ alternative?" "Mappings.Dbpedia on SMWCon?" "uGreetings to Dbpedia Team and community! I really like Mappings Wiki [1]: the whole idea of users help correcting parser rules is great! Who is the author of the system? What about presenting this side of Dbpedia on SMWCon conference [2]? I'm sure you'll get a lot of new mappers from our audience. 
Also are there any more interesting projects related to both Dbpedia and semantic wikis? Cheers, Yury Katkov, WikiVote, SMWCon Program Chair [1] [2] SMWCon_Fall_2013 uHi Yury, There are more than one developers to blame for the mappings wiki :) I am not one of them but most (if not all) are based in Berlin so this might be possible. We will have to check and we will report back. There is one more project related to DBpedia and Semantic Wikis, LightweightRDFa [1]. It adds annotations similar to SMW but without all the extra dependencies.Currently we are searching for a public Wiki to deploy LightweightRDFa and then use the DBpedia framework to extract the annotations. Cheers, Dimitris [1] On Tue, Jul 23, 2013 at 10:28 AM, Yury Katkov < >wrote: uSo, will we have the pleasure to meet you? I think that both Mappings and LightweightRDFa are great topics for the conference" "DBpedia Local Storage" "uHello, I'm creating a repository of DBpedia on a local server. I am using Sesame, to populate the datasets (n-triples) in my local repository. However when I'm loading the dataset \"infobox_split_properties\" I encounter the following error: org.openrdf.rio.RDFParseException: '135765 .0 'was not Recognised, and Could not be verified, with datatype datatype / squareKilometre [line 2]. The first thing I did was load the ontology DBpedia 3.8, followed by the dataset \"infobox_instance_types\". Where are these \"datatypes\"? Thanks in advance. Hello, I'm creating a repository of DBpedia on a local server. I am using Sesame, to populate the datasets (n-triples) in my local repository. However when I'm loading the dataset 'infobox_split_properties' I encounter the following error: org.openrdf.rio.RDFParseExcept ion: '135765 .0 'was not Recognised, and Could not be verified, with datatype advance. uHi, There is an existing thread on this issue here [1]. We are looking for contributors, you are welcome to submit a feature request [2] and ideally a fix :) Cheers, Dimitris [1] [2] On Thu, May 2, 2013 at 6:24 PM, David Miranda < >wrote: uOn 03/05/13 03:24, David Miranda wrote: The current version of Sesame (2.7.0) by default raises an error on an unknown or incorrect datatype. You can tweak this behaviour in the Rio parser settings (have a look at the Sesame documentation or ask on the Sesame discussion list for details). By the way, for the next Sesame release, basic validation support for the DBPedia datatypes is planned. See Regards, Jeen uWhat happens if I ignore the data types? Will be missing information? Has anyone done loading the DBpedia dumps with Sesame? 2013/5/2 David Miranda < > uOn 10/05/13 15:31, David Miranda wrote: If you configure the parser to skip datatype verification, the data will be uploaded as-is in your Sesame store. You will not miss any information. However, you do run a slight risk of some queries failing with errors (for example, if a floating-point-typed literal does not have a legal floating point value, querying on it may result in an error). I'm sure many people have used DBPedia data with Sesame (I personally have, certainly). The parser settings have been modified in the latest Sesame release, it's become a bit more strict. I have not yet tried to work with DBPedia data from Sesame 2.7, to be honest. Jeen uOn 11/05/13 12:50, Jeen Broekstra wrote: I should qualify this. 
Sesame's native store does not scale sufficiently to be able to load the entire DBpedia dataset into it, but you can of course create several stores with different subsets of the data, and then use federation support to query over them. Or you can use a third-party Sesame-compatible store that is designed for large datasets. Jeen" "problem creating a mapping" "uHello! I'm trying to create a mapping for an infobox in my language (pt), but I have a problem. If I use this link (*), I get the following error message. Stacktrace: com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:997) com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:947) com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:938) com.sun.jersey.server.impl.container.httpserver.HttpHandlerContainer.handle(HttpHandlerContainer.java:187) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:65) sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:65) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:68) sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:555) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:65) sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:527) java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) java.lang.Thread.run(Thread.java:619) I believe that this is happening because I'm using the char '/' in my link. When I changed it to :*Info_Gênero_musical* the error did not occur. My problem is that I need to use the first link in order to match the mapping to the right infobox at Wikipedia (this one: ). Does anyone have any idea how I can fix this problem? Thank you. Vânia uHi Vânia, at the moment, I don't know how to fix this, but I would propose a work-around. I think only the Validator has a problem with the slash. But you can save the mapping without a problem. Create a page without a '/', write the mapping and evaluate it. When you are done and the mapping is valid, do not save the page that you just used; copy the mapping code and create a new page. This time use the real title with '/'.
Paste the mapping code and save. Best, Max On Thu, Dec 16, 2010 at 12:45 AM, Vânia Rodrigues < > wrote: uHi Max, Thank you for you fast answer. Unfortunately I still have the same problem. I did exactly what you told me to do, but now, when I click on \"test this mapping\", I get the exact same error. I will try to work on other solution, but please if you find the answer for this problem, let me know. Thank you. Vânia 2010/12/16 Max Jakob < > uI finally got around to fix this problem. Template names containing a slash are now no longer a problem for the mappings server. They can now be validated and you can test the mapping with extraction samples, see for example This is especially relevant for writing Portuguese mappings. Cheers, Max On Thu, Dec 16, 2010 at 18:04, Max Jakob < > wrote:" "Data Formatting Issues" "uHello. I've been writing queries to pull in date information for cities and other locations. For most of the data, everything is formatted fine but I have run into some quirks. For example, value of 193019421947 (xsd:double) Looking at value really is 1930, 1942, 1947 I've noticed issues mostly when there are multiple dates for an entity. Here are some other example with built values: 1840-1842 (en) = 1884-1885, 1889 (xsd:integer) (xsd:integer) https://en.wikipedia.org/wiki/Brandon_Plantation_(Halifax_County,_Virginia) = c. 1800, 1842 http://live.dbpedia.org/page/Brandon_Plantation_(Halifax_County,_Virginia) = c. , 1842 (en) There will always be some poorly formatted data but it in these cases it looks like the ingestion process is altering the data. Are there changes to the ingestion process that can be made to fix quirks like this? If I come across other data that is not correct, is it helpful for me to send examples to this listserv? Or is there another way I should be submitting this? Thanks! Jason Here's a query that will return back additional examples. Most are fine but you will see several that are incorrect. SELECT DISTINCT ?placeName ?sameAs ?built WHERE { ?team a dbo:Place; rdfs:label ?placeName; owl:sameAs ?sameAs; dbp:built ?built. FILTER regex(?sameAs, \".*freebase.*\") FILTER (Lang(?placeName)='en') FILTER (strlen(str(?built))>4) } limit 50 Hello. I've been writing queries to pull in date information for cities and other locations. For most of the data, everything is formatted fine but I have run into some quirks. For example, http://live.dbpedia.org/page/Garton_Toy_Company has a dbp:built value of 193019421947 (xsd:double) Looking at https://en.wikipedia.org/wiki/Garton_Toy_Company , I see that the value really is  1930, 1942, 1947 I've noticed issues mostly when there are multiple dates for an entity. Here are some other example with built values: https://en.wikipedia.org/wiki/Mountain_View_(Chatham,_Virginia) = c. 1840-1842 http://live.dbpedia.org/page/Mountain_View_(Chatham,_Virginia) = c. -1842 (en) https://en.wikipedia.org/wiki/Clinton_County_Courthouse_Complex = 1884-1885, 1889 http://live.dbpedia.org/page/Clinton_County_Courthouse_Complex = -18851889 (xsd:integer) https://en.wikipedia.org/wiki/David_Ashbridge_Log_House = 1782, 1970 http://live.dbpedia.org/page/David_Ashbridge_Log_House = 17821970 (xsd:integer) https://en.wikipedia.org/wiki/Brandon_Plantation_(Halifax_County,_Virginia) = c. 1800, 1842 http://live.dbpedia.org/page/Brandon_Plantation_(Halifax_County,_Virginia) = c. , 1842 (en) There will always be some poorly formatted data but it in these cases it looks like the ingestion process is altering the data. 
Are there changes to the ingestion process that can be made to fix quirks like this? If I come across other data that is not correct, is it helpful for me to send examples to this listserv? Or is there another way I should be submitting this? Thanks! Jason Here's a query that will return back additional examples. Most are fine but you will see several that are incorrect. SELECT DISTINCT ?placeName ?sameAs ?built WHERE {   ?team a dbo:Place;  rdfs:label ?placeName;     owl:sameAs ?sameAs;     dbp:built ?built.   FILTER regex(?sameAs, '.*freebase.*')   FILTER (Lang(?placeName)='en')   FILTER (strlen(str(?built))>4)  } limit 50 uHi Jason & apologies for the delayed reply I think you run into a dbo vs dbp issue in general data under the dbp ( not considered of good quality they are usually extracted using greedy heuristics. Prefer data from the dbo namespace when possible because these are of much higher quality. in case you do not know, you can also improve the data by updating existing mappings at e.g. the pages you mention use the after you improve the mapping you can test it with random pages, e.g. or use your own e.g. since you are working with Live, all the mapping changes will be updated in the data within hours or the latest, in case of errors, 1 month in particular to your example, you are looking at the \"built\" property which, in the above mapping, is mapped to dbo:yearOfConstruction and exists in all your example resources so try with this query SELECT DISTINCT ?placeName ?sameAs ?built WHERE { ?team a dbo:Place; rdfs:label ?placeName; owl:sameAs ?sameAs; dbo:yearOfConstruction ?built. FILTER regex(?sameAs, \".*freebase.*\") FILTER (Lang(?placeName)='en') } limit 50 hope that helps On Sat, Aug 6, 2016 at 2:05 AM, Jason Hart < > wrote:" "Lookup not running http://localhost:1111." "uHello, everyone! I installed Lookup service, but when I use the command: ./run Server dbpedia-lookup-index-3.8 It opens the browser with the address: that talks about Berlin. If I erase the right part of the address leaving only the correct part: the page shows: \"Oops! This link appears broken.\" What may be the problem? Thanks. Hello, everyone! I installed Lookup service, but when I use the command: ./run Server dbpedia-lookup-index-3.8 It opens the browser with the address: Thanks." "Mvn install problems" "uHi all, I checked out the source code from the root repository and I was following the instruction in page I typed mvn install from the extraction However, I received the following error messages. 'repositories.repository.id' must be unique: scala-tools.org -> Can some one explain to me the reason for that? Thanks, Wu Hi all, I checked out the source code from the root repository and I was following the instruction in page I typed mvn install from the extraction However, I received the following error messages. ' repositories.repository.id ' must be unique: scala-tools.org -> Wu ubegin:vcard fn;quoted-printable:Jos=C3=A9 Paulo Leal n;quoted-printable:Leal;Jos=C3=A9 Paulo org;quoted-printable:Universidade do Porto;Departamento de Ci=C3=AAncia de Computadores adr:;;DCC - R. do Campo Alegre, 1021/1055;;;4169-007 PORTO;Portugal email;internet: title:Professor Auxiliar tel;work:+351 220 402 973 x-mozilla-html:FALSE url: version:2.1 end:vcard uI wonder if there is anyway we can use some older version of the source code/binary? I just want to use it to do some experiments in extracting ontology information from my own wiki website. 
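Returning to the dbo-versus-dbp distinction in the data-formatting thread above, a small sketch that puts the raw dbp:built value next to the mapped dbo:yearOfConstruction value for the same resource makes the quality difference easy to eyeball (property names as used in that thread; the join itself is illustrative, not taken from the reply):

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?raw ?mapped WHERE {
  ?place dbp:built ?raw .
  OPTIONAL { ?place dbo:yearOfConstruction ?mapped }
} LIMIT 20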
Thanks, Wu 2010/11/30 José Paulo Leal < > uHi, there was a problem with multiple repositories having the same ID. This luckily pointed to the fact that one of them is not required anymore. The POM files are now updated. Can you please try again? Max 2010/11/30 Leon668 < >: uHi, I updated the project from the SVN repository and the previous error disappeared but now I have a different one: Failed to execute goal on project dump: Could not resolve dependencies for project org.dbpedia.extraction:dump:jar:2.0-SNAPSHOT: Failure to find org.dbpedia.extraction:core:jar:2.0-SNAPSHOT in resolution will not be reattempted until the update interval of maven2-repository.dev.java.net has elapsed or updates are forced Here is the maven output zp [INFO] Scanning for projects[WARNING] [WARNING] Some problems were encountered while building the effective model for org.dbpedia.extraction:dump:jar:2.0-SNAPSHOT [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /home/zp/Orienta/Vânia_Rodrigues/workspace/extraction/pom.xml, line 73, column 21 [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-surefire-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /home/zp/Orienta/Vânia_Rodrigues/workspace/extraction/pom.xml, line 81, column 21 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. [WARNING] [INFO] [INFO] u114a115,120 uThanks! 2011/1/6 José Paulo Leal < >" "setting up dbpedia endpoint- isql use" "uHi, I am trying to set up a dbpedia endpoint and host it on a local machine. I have downloaded and installed the Virtuoso open source version as well as the DBpedia dumps. I try to load the dumps as explained here: But I have difficulty with the Bulk Loading script and the use of isql. What packages should I download and install in order to use isql? Hi, I am trying to set up a dbpedia endpoint and host it on  a local machine. I have downloaded and installed the Virtuoso open source version as well as the DBpedia dumps. I try to load the dumps as explained here: isql? u0€ *†H†÷  €0€1 0 + uI have downloaded, compiled and built from source code, and I have the 6.1.4. version of virtuoso. 2012/2/2 Hugh Williams < > uHi Chryssa, I had that problem too, not building from sources but from a virtuoso 6.3 installer. In my system (Ubuntu 10) there is ANOTHER isql. The good one is located in the bin directory where you have installed Virtuoso. u0€ *†H†÷  €0€1 0 +" "Problems with the mappings site" "uHi all, it looks like there are some problems with the mappings website. 1) I have created a mapping for Infobox_comic_book_title [1] adding a few property mappings. Anyway they do not show up in the statistics page [2] 2) If I try to validate any of the mappings page nothing happens (using the Validate button) 3) I have been trying to fix some of the errors that show up in the validation page [3] - fixing the syntax in the ontology class Television Station [4] - fixing some errors in the Infobox college coach [5] But the errors do not go away, and I do not understand why. Do you have any clue? Regards, Andrea [1]: [2]: [3]: [4]: [5]: Hi all, it looks like there are some problems with the mappings website. 1) I have created a mapping for Infobox_comic_book_title [1] adding a few property mappings. 
Anyway they do not show up in the statistics page [2] 2) If I try to validate any of the mappings page nothing happens (using the Validate button) 3) I have been trying to fix some of the errors that show up in the validation page [3] - fixing the syntax in the ontology class Television Station [4] - fixing some errors in the Infobox college coach [5] But the errors do not go away, and I do not understand why. Do you have any clue? Regards, Andrea [1]: Mapping_en:Infobox_college_coach uHi Andrea, The problem should be fixed now. Can you please check and let us know if we missed something? Best, Dimitris On Sat, Jan 26, 2013 at 1:32 PM, Andrea Di Menna < > wrote: uHi Dimitris, it seems everything is fine now :) Thanks! Andrea 2013/2/7 Dimitris Kontokostas < >" "Botticelli painting : trouble with ( ) in URIs, and more" "uHi there Just discovered that the two following URIs do not actually yield the same description, only the former giving proper description, the latter only types looks like a weird bug. Regarding the content of the \"good\" description Why do we have the \"artist\" property value as text, although it is clearly a link in the infobox at The result is that it's not obvious to discover this work from the artist URI, except akward browsing through categories :( Moreover contains reference to the said painting, does not link to the paintings URIs, only full-text description Seems the same is true for any list in I agree the format of WP lists pages is far from regular, but many of them are flat bullet-point lists which should be easy to parse. On my wishlist for 2010 :) Thanks Bernard uHi, I would like to repeat this issue, I came across it only today. It seems to me that in the current DBpedia data set URIs are stored in encoded form, therefore only the first URI yields the \"correct\" data. This was not the case in the 3.2 and 3.3 releases, which we could verify with our local replicas. This is actually a serious issue, because it practically renders printed DBpedia URIs containing parentheses useless: when readers type them into the webbrowser they usually end up with the second variant, where only limited data is returned. Best, Bernhard uHi Bernard, On 22.01.2010 10:21, Bernard Vatant wrote: The second URI is apparently only available since the YAGO dataset contains decoded links to DBpedia URIs. Therefore there are only those YAGO types visible. The artist property value is only a literal since the infobox for paintings allows users to enter only the artist names and converts them (the whole string) to links. We don't map list elements to our ontology. Also I don't think that from the category itself we could conclude much for the contained list elements. So unfortunately part of this page's text is interpreted as an abstract. Regards, Anja" "Search DBpedia.org - Bad Gateway Workaround" "uNo issue without workaround: you can use Georgi Von: Georgi Kobilarov Gesendet: Di 24.04.2007 13:35 An: ; Betreff: RE: Announcing: Search DBpedia.org Hi all, unfortunately there is an issue with our Apache web server, so you might get a 502 \"bad gateway\". To fix that I have to update Apache to latest version, which can't be done immediately. I hope to get this fixed soon. Georgi" "DBpedia - Querying Wikipedia like a Database:Improveddataset released." "uHi Nick, As you already found out, there are two ways of querying DBpedia: 1. the SPARQL endpoint which can be queried directly or via the SNORQLor OpenLink javascript query builders and 2. the Leipzig Query Builder. 
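On the parentheses issue in the Botticelli thread above: the two URI spellings differ only in whether '(' and ')' are percent-encoded. The resources discussed there are not reproduced here, so the sketch below uses Primavera (a Botticelli painting) purely as an illustration; per the thread, at that time only the %-encoded spelling matched the URIs stored in DBpedia:

DESCRIBE <http://dbpedia.org/resource/Primavera_%28painting%29>
DESCRIBE <http://dbpedia.org/resource/Primavera_(painting)>

Running the two queries separately and comparing the result sets is a quick way to check whether the asymmetry reported in the thread is still present.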
You find documentation on how to write SPARQL queries at: More links at As we got lots of traffic after the announcement yesterday, the SPARQL endpoint is a bit unstable this morning. We are busy fixing this. For questions and documentation on how to use the Leipzig Query Builder ( ( ) and Jens Lehman ( ) who have implemented the tool. The queries from the Leipzig Query Builder do not run against the SPARQL endpoint, but against a RDF store in Leipzig. I don't know if this store has already been updated for the new dataset, but I think Sören and Jens will take care of this shortly. All the best, Chris" "How to contribute to the project" "uHi, we would like to know the exact steps to start contributing to this project. We are interested both in coding and in documentation. But we are a bit confused in what regards to the process of contribution. How is it done? Can someone who is in charge of the project give us some insights. Thanks. Your´s trully, Paulo Torres, César Araújo, Raquel Santos Enviado de Correio do Windows" "How to query "is of"" "uHi All, Thanks a lot for your help, Dan. It works fine for me. I have another question, but i'm sorry because this is OOT. Actually i try to access any country details from both dbpedia and cia factbook. I use sparql endpoint : my php code. For dbpedia there is no problem to access the information. However, when i tried to access the factbook, it does not give any result except i access the specific country description by querying the query through the endpoint first. However, when i tried using this endpoint : returned error, but it displays the expected result. Please help me best regards, Hendrik Hi All, Thanks a lot for your help, Dan. It works fine for me. I have another question, but i'm sorry because this is OOT. Actually i try to access any country details from both dbpedia and cia factbook. I use sparql endpoint : Hendrik" "Extract specific properties of a resource" "uHi, I am starting to use dbpedia for a project so my question may be very simple but I couldn't find any resource on this so asking it here. I want to extract a certain set of properties for a given resource. For example, I want to get the country, longitude, latitude and isPartOf properties of a resource of type PopulatedPlace. I know how to extract all the properties of this resource using the following query but this is slow and excessive when i only just want specific properties. Can anyone suggest what is a good solution for this? PREFIX geo: PREFIX dbpedia-owl: PREFIX dbpedia: PREFIX dbpprop: PREFIX rdfs: SELECT ?pl ?lbl ?r ?v WHERE { { ?pl rdf:type dbpedia-owl:PopulatedPlace . ?v rdf:type dbpedia-owl:PopulatedPlace . ?pl rdfs:label ?lbl. ?pl ?r ?v . FILTER(?lbl=\"California\"@en). } UNION { ?pl rdf:type dbpedia-owl:PopulatedPlace . ?v rdf:type dbpedia-owl:PopulatedPlace . ?pl rdfs:label ?lbl. ?v ?r ?pl . FILTER(?lbl=\"California\"@en). } } LIMIT 1000 ​Thanks!​ Ghufran Hi, I am starting to use dbpedia for a project so my question may be very simple but I couldn't find any resource on this so asking it here. I want to extract a certain set of properties for a given resource. For example, I want to get the country, longitude, latitude and isPartOf properties of a resource of type PopulatedPlace. I know how to extract all the properties of this resource using the following query but this is slow and excessive when i only just want specific properties. Can anyone suggest what is a good solution for this? 
PREFIX geo: < Ghufran uIf you just want some particular properties, then you can specify them with values. E.g., select ?p ?o { values ?p { prop1 prop2 prop3 … } dbpedia:Mount_Monadnock ?p ?o . } On Wed, Jun 4, 2014 at 9:52 AM, Mohammad Ghufran < > wrote: uHello, Thanks for your response. I tried doing this but it gives me an empty result even though I verified that the properties I am looking for exist (even though I am not sure which Prefix to use. For example, I tried : PREFIX geo: PREFIX dbpedia-owl: PREFIX dbpedia: PREFIX dbpprop: PREFIX rdfs: SELECT ?pl ?lbl ?r ?v WHERE { { ?pl rdf:type dbpedia-owl:PopulatedPlace . ?v rdf:type dbpedia-owl:PopulatedPlace . ?pl rdfs:label ?lbl. ?pl ?r ?v . values ?propertyType {dbpedia-owl:capital dbpprop:capital dbpedia-owl:country dbpprop:country} ?r a ?propertyType . FILTER(?lbl=\"California\"@en). } } LIMIT 1000 I was able to get the result with using a filter such that: FILTER (?r a dbpedia-owl:capital || ?r a dbpedia-owl:country) . But I am not sure if it is the best way to do it or even the right way to do it. Mohammad Ghufran On Wed, Jun 4, 2014 at 4:16 PM, Joshua TAYLOR < > wrote: uOn Wed, Jun 4, 2014 at 10:24 AM, Mohammad Ghufran < > wrote: capital and country are properties, not types, and certainly not types of properties. This should be ?pl ?r ?v . values ?r {dbpedia-owl:capital dbpprop:capital dbpedia-owl:country dbpprop:country} instead. uHello, Thank you for the help! It works better ( at least I get some response). I put both dbpprop and dbpedia-owl because I am confused how to access them. To continue my example, the page for California on dbpedia is : as follows: dbpedia-owl:country - dbpedia:United_States ​However, i get different results on DBPedia sparql endpoint and my own local endpoint (which i created by importing all the English language DBPedia files into Virtuoso open source). I get the country using dbpedia-owl:country from DBPedia but i don't get anything for it on my local install. I am clueless about what could be the reason for it. Best Regards, Ghufran On Wed, Jun 4, 2014 at 6:33 PM, Joshua TAYLOR < > wrote: uOn Fri, Jun 6, 2014 at 8:38 AM, Mohammad Ghufran < > wrote: Well, without telling us how you imported the data, and how you're querying, we're just as clueless as you, I'm afraid. :) Be sure to check that the data was loaded properly, and that you've got the right namespace declarations (e.g., dbpedia-owl: is , etc.). //JT uHello, I followed the instructions given here : but with Cygwin since I am using Windows. I installed the Virtuoso Opensource server and imported all the files in the DBPedia english language mirror. I am querying from the sparql query editor normally (but the results are the same if I query using the isql command line tool. I am not sure how to verify if the import was complete or not. I checked the results I got using the query mentioned in the link above and got the following results: SQL> sparql SELECT ?g COUNT(*) { GRAPH ?g {?s ?p ?o.} } GROUP BY ?g ORDER BY DESC 2; Connected to OpenLink Virtuoso Driver: 07.10.3207 OpenLink Virtuoso ODBC Driver g callret-1 LONG VARCHAR LONG VARCHAR 348422644 2939 2639 160 14 5 Rows." "Mapping types at the instance level" "uIt's pretty easy to map some types between dbpedia and freebase; for instance, both systems agree on what a \"Person\" is, so we can say the \"Person\" type is equivalent. If we find that, say, a Yago class contains a list of Persons, then we can say that that the Yago class is a subclass of Person, even the freebase \"Person\". 
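A sketch of the "easy" case just described: per Yago class, how many of its members are already typed dbo:Person (this assumes the Yago types dataset is loaded and that its class URIs sit under http://dbpedia.org/class/yago/; the query is heavy, so in practice one would restrict it to a single class of interest):

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?yagoClass (COUNT(?x) AS ?members) WHERE {
  ?x a ?yagoClass ;
     a dbo:Person .
  FILTER(STRSTARTS(STR(?yagoClass), "http://dbpedia.org/class/yago/"))
}
GROUP BY ?yagoClass
ORDER BY DESC(?members)
LIMIT 50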
That's all pretty easy. Then there are the types that don't quite align. Dbpedia's \"Organization\" class corresponds most closely to \"Employer\" in Freebase. Employer is a duck type, that is, you'd prove that X is an employer if there some person Y such that X employs Y. Most significant organizations employ, and few non-organizations employ (Is \"Dalia Dippolito\" an employer because she hired a hit man to kill her husband?) There's still the problem that a volunteer FireDepartment is not an employer, but other FireDepartments are. Dbpedia's \"Organization\" which contains FireDepartment, FootballTeam, Business and such, is a more useful type to me than \"Employer\" is. We don't have a good language for describing the relationship between \"Organization\" and \"Employer\". \"Employer is a subclass of Organization\" comes close, but it might not be exactly true. Now, a set of \"Organizations\" could be extracted from Freebase by: (i) discovering freebase types which are subclasses (or near-subclasses) of Organization, and (ii) filtering non-Organizations out, if they exist No we've got examples where: (a) freebase can help dbpedia (finding the missing {ersons), and (b) dbpedia can help freebase (creating a missing type) There are a lot of practical issues in expressing complex type mappings with things like OWL and SKOS, but there is a simple method that works with well-implemented standards. Rather than describing a type mapping, a type mapping can be expressed as a set. For instance, I could go through the dbpedia" "Missing dataset(s)" "uHello guys, I really appreciate dbpedia & the huge amount of information and possibilities that come with this project. But today I got stuck when I was trying to find an entity like \"Richterin Barbara Salesch\". In the german wikipedia there is an article with this title since 2006 (ok, till 2007 it was just a redirect), but I can't find anything like that in dbpedia. I used the SPARQL-interface to find out all entities that mention the word \"salesch\" in any way. The only results I got were the following: How does that come? best regards, Sebastian P.S.: If you are wondering, why I need such strange information: We are working on a project that shall collect & somehow process information about (german) TV-shows uOn 11/8/10 6:49 PM, Sebastian Stange wrote: uAm 09.11.2010 02:24, schrieb Kingsley Idehen: uOn 11/9/2010 1:44 AM, Sebastian Stange wrote: Dbpedia, at the time being, is centered around the English Wikipedia; it does contains assertions that certain pages in other Wikipedias are about the same topic as pages in en.wikipedia, but that's derived from en.wikipedia, not the others. Freebase processes content from en.wikipedia, but not the others. I think there are people who've pointed the dbpedia extraction software at other wikipedias, but I don't think they're publically available. DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" On 11/9/2010 1:44 AM, Sebastian Stange wrote: No, not reallyAs far as I can see, this has only one hit more than my version but does not contain any entity called \"Richterin_Barbara_Salesch\". This is the one I would like to see in the results: Wikipedia; it does contains assertions that certain pages in other Wikipedias are about the same topic as pages in en.wikipedia, but that's derived from en.wikipedia, not the others. Freebase processes content from en.wikipedia, but not the others. 
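One way to materialise the "type mapping expressed as a set" idea from the mapping-types post above is to dump, for a class, all members together with their Freebase links (a sketch; it assumes the dbo:Organisation spelling used by the DBpedia ontology and Freebase links under rdf.freebase.com, as in the older link datasets):

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?org ?freebase WHERE {
  ?org a dbo:Organisation ;
       owl:sameAs ?freebase .
  FILTER(STRSTARTS(STR(?freebase), "http://rdf.freebase.com/"))
} LIMIT 100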
I think there are people who've pointed the dbpedia extraction software at other wikipedias, but I don't think they're publicly available. On 11/9/2010 1:44 AM, Sebastian Stange wrote: No, not really. As far as I can see, this has only one hit more than my version but does not contain any entity called "Richterin_Barbara_Salesch". This is the one I would like to see in the results: uOn 09/11/2010 16.10, Paul Houle wrote: We're just working on that; soon we will make a new dataset available. If someone is interested, please contact me. cheers, roberto uHi Roberto & all the others, I found a German dbpedia project that will be sufficient for our purpose. ( Nevertheless I would encourage the project team to include all specific languages in one unified endpoint ( Thanks to all that wasted a thought on my problem ;) best regards, Sebastian On 10.11.2010 02:12, Roberto Mirizzi wrote: Hi Sebastian, our idea is to be plugged into the Linked Data cloud. For this reason it is important to make available a SPARQL endpoint. This is not a problem, since if you use a triple store such as Virtuoso, it offers the endpoint "for free". :-) regards, roberto On 10.11.2010 02:12, Roberto Mirizzi wrote: uHi, For now I know 3 online multilingual dbpedia projects, de.dbpedia.org, ko.dbpedia.org and el.dbpedia.org, and from the mappings wiki I can see that there is work going on with other countries as well. There are a few issues, some of them discussed in previous posts (e.g. ), that led us to form a group (informal for now) that will deal with them. We created a page ( ) to post such issues and start getting organized. Anyone who wishes to participate is very welcome; you can contact me (or any of the other contacts) for details. regards, Dimitris Kontokostas" "DBpedia Spotlight v0.6 Released (Text Annotation with DBpedia)" "u(apologies for cross posting) Hi all, We would like to announce a maintenance release of DBpedia Spotlight v0.6 - Shedding Light on the Web of Documents. DBpedia Spotlight looks for ~3.5M things of ~320 types in text provided as input, and tries to link them to their global unique identifiers in DBpedia. In this version we have enabled much more flexibility in spotting (added support for keyphrase extraction, NER and other algorithms), including a /spot API and the ability to provide your own spotter results into our disambiguation algorithm. Plus, you can now send a URL, and we will extract the text and annotate it for you. Check it out: Only named entities? More: A new dataset and spotting evaluations were recently published at LREC 2012: Mendes, P.N., Daiber, J., Rajapakse, R., Sasaki, F., Bizer, C. Evaluating the Impact of Phrase Recognition on Concept Tagging. Proceedings of the International Conference on Language Resources and Evaluation, LREC 2012, 21-27 May 2012, Istanbul, Turkey. Mendes, P.N., Jakob, M., Bizer, C. DBpedia for NLP: A Multilingual Cross-domain Knowledge Base. Proceedings of the International Conference on Language Resources and Evaluation, LREC 2012, 21-27 May 2012, Istanbul, Turkey. We are happy to announce that DBpedia Spotlight can now also be used from Apache Stanbol. See: Upcoming for 0.7: elastic indexing on hadoop, topical classification, and new disambiguation strategies.
ACKNOWLEDGEMENTS Many thanks to the growing community of DBpedia Spotlight users for your feedback and energetic support. We would like to especially thank: Rohana Rajapakse, Jo Daiber, Hector (Liu Zhengzhong), Max Jakob, Iavor Jelev, Reinhard Schwab, Giuseppe Rizzo for their contributions to our codebase. This release of DBpedia Spotlight was partly supported by The European Commission through the project LOD2 – Creating Knowledge out of Linked Data ( Interactive Knowledge Stack ( HOW TO USE * Demonstration UI: * Call our web service: You can use our demonstration Web Service directly from your application. curl http://spotlight.dbpedia.org/rest/annotate \" "Invalid N-TRIPLES output?" "uHi, I am retrieving data from dbpedia in N-TRIPLES format since my client supports only this serialization format. However I wonder if the output provided by dbpedia is invalid: for instance, when I issue the query DESCRIBE with header \"Accept: text/plain\" [1] the result contains the following line: 395.51 . However according to [2] literals in N-triples must be encapsulated in quotation marks. [1] curl -H \"Accept: text/plain\" [2] Is this a bug/problem in dbpedia or am I getting something wrong? Thanks, and best regards Bernhard uBernhard, Looks like a bug in Virtuoso. I guess a Turtle-ism crept into the N- Triples serializer. As a workaround, you could try sending DBpedia RDF/ XML output through something like triplr.org to obtain valid N-Triples. Best, Richard On 14 Aug 2009, at 09:41, Bernhard Schandl wrote: uHi Bernhard, We are looking into this and shall report back when their is an update Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 14 Aug 2009, at 11:53, Richard Cyganiak wrote: uRichard Cyganiak wrote: Yes, a bug :-( Kingsley" "Russian labels in DBpedia OWL" "uDear list members, I'm happy to salute you in the list devoted to the excited DBpedia project :). I notice that there is no russian labels in DBpedia OWL and therefore (as I suppose) there is no russian view of DBpedia classes and properties. Is it so? And how could I contribute them? uWelcome to the DBpedia community Andrey :) You can edit the ontology labels (as well as the ontology and template mappings) from the DBpedia mappings wiki [1]. You can register and then request an editor account from this mailing list. You can find some useful links for 'ru' here [2] Best, Dimitris [1] [2] On Wed, Oct 30, 2013 at 9:00 PM, Andrey Siver < >wrote:" "Special chars in property URIs (was: DBpedia 3.0 Dataset problems.)" "uJens and all, I'm moving this thread which started off-list to the DBpedia discussion list. It's about weird characters in some infobox dumps, especially the japanese infobox dump. I'll start with a summary of the problem. The names of template variables in some infoboxes have international characters. For example, a german-language infobox might have a template variable \"größe\" (meaning \"size\"). Our current approach to deal with this is to: 1. apply the standard Wikipedia %-encoding to the variable name in order to create a property URI in the namespace, e.g. 2. convert the %XX triplets into _percent_XX because property URIs containing the % character cannot be serialized as RDF/XML. The property URI now looks like this: The problem is that we end up with some very long and ugly property names! Certainly no one would want to use properties like this in a SPARQL query! They also mess up the Pubby HTML view. 
This hasn't been a big problem when we only did infobox extraction from the English Wikipedia, because it contains very few of those troublesome template names, but it's a huge problem now that we do infobox extraction for many languages. On 15 Feb 2008, at 12:11, Jens Lehmann wrote: I can think of three possible approaches: 1. Simply drop the troublesome triples. If a triple can't be serialized as RDF/XML, just ignore the triple. We considered this for the English infobox extraction, but obviously this is not a good solution for the international infobox dumps which have a lot of those triples. 2. Encode the % character as something shorter than \"_percent_\", e.g. just a simple dash. This is still ugly but at least not so long: The characters we can use without problems are: letters, digits, underscore \"_\" and dash \"-\". 3. Use real international characters in the URI, e.g. I'm not quite sure if this is possible. RDF supposedly supports IRIs (the new style of i18ned URIs that can contain Unicode letters), and XML can certainly use these characters in element names, so it *should* be possible. But this is somewhat uncharted territory, someone would have to dig through the relevant specs to see what exactly is or is not allowed, and from prior experience I would expect a lot of trouble with tools in our toolchain that are not quite Unicode-ready. I guess if we want to go with option 2, then this could be done quickly. Option 3 would probably have to wait for the next release. We may also decide to do 2 now and look into 3 later. What do you all think? What should we do? Richard uHi I definitely want this third approach, not only for properties, but also for the resource names (URIs). %encoded names are totally unusable for non-western language users. Instead, if DBPedia employs IRI for those names, it will be really very, very good news for us ! I have certain amount of experience to use Japanese characters in RDF URIrefs (IRIs), and found most RDF libraries can handle IRIs. Exceptions are characters in compatibility areas in Unicode (e.g. Japanese counterparts to ASCII signs/punctuations etc), which some libraries cannot use as local name of an URI (not necessarily causes errors, but cannot generate correct local name). I'd not be able to involve in programming itself, but happy to provide information as far as I can. best regards, uMasahide, On 18 Feb 2008, at 14:22, KANZAKI Masahide wrote: With regard to resource URIs (as opposed to property URIs), it's worth pointing out that Wikipedia itself uses %-encoded URIs for article pages. I think this is a good reason for DBpedia to also stick with %-encoded URIs, because there is some value in the direct correspondence between Wikipedia URIs and DBpedia URIs, just replace with (Most browsers show the international characters when displaying %- encoded URIs, and automatically do the %-encoding when international chars are entered into the URL bar, so it's easy to forget that Wikipedia also does %-encoding everywhere under the hood.) That's good news. Okay, I do have some questions for you. If you serialize a document with Japanese chars in class or property URIs as RDF/XML, how do the characters show up in XML element names? Do they really show up as Japanese characters in the XML names? Are there examples of RDF vocabularies or ontologies that have Japanese characters in URIs? This would be useful for testing purposes. In your experience, does QName expansion usually work with Japanese characters? 
One of the main attractions with allowing i18n chars in URIs would probably be that you could write SPARQL queries like this: SELECT * WHERE { people:神崎正英 foaf:interest ?interest . } Does something like that work in practice? Grateful for any information, Richard uHi Richard, 2008/2/19, Richard Cyganiak < >: I don't agree, but this should be another topic (it may go too long ;-) Yes. We can use Japanese characters in XML names, i.e. element names and attribute names, without any problems. I publish a few vocabularies that include Japanese names as properties (e.g. [1]), though I don't see many instances suitable for testing, since people tend to use its English-named properties. Yes, no problem. I prepared a small example on my site that includes one property "名前" (which means "name" in English). Try the following SPARQL, and you will get the result ?nick "masaka". PREFIX ex: PREFIX foaf: SELECT ?nick FROM WHERE { ?who ex:名前 "神崎正英"; foaf:nick ?nick. } cheers, [1] doas uSo it sounds more feasible than I thought. Good to know, thanks! Richard On 18 Feb 2008, at 16:59, KANZAKI Masahide wrote:" "Mappings namespace requests" "uHi Angel, Miguel, Uldis, Thanks for your interest in the DBpedia internationalization effort! It would be great if all of you could join the monthly developers telco. The next is scheduled for tomorrow, 2 p.m. CET: WRT new mappings namespaces, we currently add them when multiple requests are gathered, as the process is unfortunately quite time consuming. I think we have now reached a sufficient amount of requests, and we'll discuss that in the telco. Looking forward to talking to you tomorrow. Cheers, uHi Marco, I will be glad to join the telco. Cheers, Uldis On 7 July 2015 at 13:55, Marco Fossati < > wrote: uHi all, I will also be there. See you, Miguel On 08/07/15 13:53, Uldis Bojars wrote:" "Problem with person classes" "uHi, I'm currently trying to make use of the yago-Dataset, especially of the links from Wikipedia categories to wordnet "concepts". That way we could link people within the category "American tennis players" and "German tennis players" to the concept "tennis player", which would be a sub-concept of "person". But I stumbled on the following problem: there is no concept "tennis player" in dbpedia, because the Wikipedia article Same problem with e.g. chess_player, novelist, humorist, painter, philosopher. Any suggestions? Cheers, Georgi uGeorgi Kobilarov wrote: Georgi, We can fix Wikipedia :-) We can make some changes that could then trigger an avalanche ('network effects') of changes across Wikipedia. Note we can contact Wikipedia about edit bots for this kind of edit etc., which would reduce the labor intensity of this effort.
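For the category-to-concept idea in the person-classes thread above, the raw material is easy to pull out of DBpedia (a sketch using the current dct:subject modelling of article categories; older DBpedia releases used skos:subject instead, and the two category names are the ones mentioned in the thread):

PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?player WHERE {
  { ?player dct:subject <http://dbpedia.org/resource/Category:American_tennis_players> }
  UNION
  { ?player dct:subject <http://dbpedia.org/resource/Category:German_tennis_players> }
}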
Cheers Georgi Von: Kingsley Idehen [mailto: ] Gesendet: Mo 19.03.2007 16:54 An: Georgi Kobilarov Cc: Betreff: Re: [Dbpedia-discussion] Problem with person classes Georgi Kobilarov wrote: Georgi, We can fix Wikipedia :-) We can make some changes that could then trigger an avalanche ('network effects') of changes across Wikipedia. Note we can contact Wikipedia about edit bots for this kind of edit etcwhich would reduce the labor intensity of this effort. uOn 19 Mar 2007, at 17:16, Georgi Kobilarov wrote: I think “strictly refuses” is too strong a term. “Frowns upon” might be more accurate. A lot of work went into most articles, they are a very carefully balanced compromise between many people with opposing viewpoints. It's understandable that authors will be unhappy about some bot blundering through thousands of articles. Let me guess. What would the Tennis Player article say? Much of the same as the Tennis article? No, repetition is bad, even in Wikipedia. So: “A person who practices [[Tennis]].” But such an article is a bit pointless. A redirect makes more sense. The reader is directed to a page that gives her all information needed to understand what a tennis player is. Wikipedia is designed for human readers. It's an encyclopedia, not an attempt at domain modelling. And shouldn't Tennis Player be a category anyway? Richard uRichard Cyganiak schrieb: I agree and it exists: Note however, that YAGO only picks leaf categories in Wikipedia and makes them subclasses of concepts in WordNet. Whether or not a class tennis player exists in YAGO, depends on WordNet (and the YAGO algorithm). Jens PS: We are working on a class hierarchy close to Wikipedia, but this will take some weeks." "image url hash keys incorrect?" "uI'm using ImageExtractor#getImageUrl in the extraction_framework to get the url of an image. val md = MessageDigest.getInstance(\"MD5\") val messageDigest = md.digest(fileName.getBytes) val md5 = (new BigInteger(1, messageDigest)).toString(16) val hash1 = md5.substring(0, 1) val hash2 = md5.substring(0, 2); val urlPart = hash1 + \"/\" + hash2 + \"/\" + fileName Most of the time, the function works correctly but on a few cases, it is incorrect: For \"Stewie_Griffin.png\", I get 2/26/Stewie_Griffin.png but the real one is 0/02/Stewie_Griffin.png The source file info is here: Any ideas why the hashing scheme doesn't work sometimes? uThanks to the folks on the wikipedia api mailing list, the problem was that the leading zero was being eaten. This will fix it in ImageExtractor#getImageUrl:    val result = (new BigInteger(1, messageDigest)).toString(16) val md5 = if (result.length % 2 != 0) \"0\" + result else result I would submit a patch but i'm unsure how to do so. On Sat, Dec 3, 2011 at 6:38 PM, Tommy Chheng < > wrote: uThis solution is also flawed. Check with Batman_Kane.jpg I recommend using org.apache.commons.codec.digest.DigestUtils#md5Hex Relying on a commonly used library is a lot less bug prone. On Mon, Dec 5, 2011 at 4:17 PM, Tommy Chheng < > wrote: uHi Tommy, On 12/07/2011 12:13 AM, Tommy Chheng wrote: I've already spotted the problem you mentioned and fixed it in our live instance which is available at \" During its run DBpedia-Live fixes more articles as it encounters them, so you will not find foaf:depiction predicate for all articles, but by time more and more will have their corresponding foaf:depiction predicates. We will include that fix also in the next release of DBpedia. Please, have a look on it and send me any feedback you have about it." 
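To see whether the corrected image URLs from the thread above have reached the data, one can simply ask for the depiction of the affected resource (Stewie_Griffin, the example from the thread); dbo:thumbnail is included as the other image property DBpedia commonly emits:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?img ?thumb WHERE {
  <http://dbpedia.org/resource/Stewie_Griffin> foaf:depiction ?img .
  OPTIONAL { <http://dbpedia.org/resource/Stewie_Griffin> dbo:thumbnail ?thumb }
}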
"Accuracy of coordinates in dbpedia/wikipedia & freebase" "uI've recently put up a site that uses coordinate information from Freebase and Dbpedia, and I'm starting to think about how to clean up certain data quality problems I'm encountering, for instance, see: In this particular case, I've only got data from dbpedia, which drops the point a few hundred km from where it really is It's obvious that this is a bad one because it's right in the middle of Lake Erie. Freebase doesn't have any coordinate for this thing (seems to me that it should), and at the moment, Wikipedia has the right coordinates (at least on Google maps I see a big factory building) My guess is that wikipedia might have been wrong at one time, and has had it corrected. It's also possible that the conversion wasn't done right in dbpedia, since coordinates are represented differently in a few hundred different infoboxes. It seems to me that both the number of points and the quality of points in Wikipedia has been improving dramatically over the last two years About a year ago I plotted the points for Staten Island Railroad stations and found that the railroad was displaced a few km east and ran right under the middle of the Tapan Zee bridge Now it's much better. I can find examples where: (a) dbpedia is right and freebase is wrong (for instance, a town in continental Europe gets its longitude sign flipped and ends up with the wrecked ships west of the UK uOn 9/27/10 4:42 PM, Paul Houle wrote: uIl 28/09/2010 00:00, Kingsley Idehen wrote: uOn 9/28/10 3:43 AM, Roberto Mirizzi wrote: uHello Paul, On 27.09.2010 22:42, Paul Houle wrote: You can check this by comparing the regular DBpedia and DBpedia Live: Indeed, the coordinates have changed, so there was probably an error in Wikipedia. Consider using LinkedGeoData [1]. As member of both projects (DBpedia and LinkedGeoData) I am quite sure that the latter has more accurate coordinates. LinkedGeoData also has DBpedia links, although they need (and will be) updated, because of some changes in LinkedGeoData. Note that for large objects (e.g. Russia and other countries) both LinkedGeoData/OpenStreetMap and DBpedia/Wikipedia contain a reference point, which is a good representative for this object. However, there is no strict mathematical definition how to compute it, which means that reference points in LinkedGeoData and DBpedia do not necessarily coincide. Kind regards, Jens [1] http://linkedgeodata.org uIl 29/09/2010 10:10, Jens Lehmann ha scritto: imho I don't think that comparing DBpedia with its live version is the solution. In addition to having to double-check each resource, if something is different in the live version, it does not mean that surely there was an error in the previous version: the error could be in the live version. I remember some time ago, I was looking for the DBpedia live page about the programming language PHP, and the abstract reported: \"PHP: Problèmes d'Hygiène Personnelle\". :-) Unfortunately vandalism is quite widespread in WIkipedia. Probably, at least nowadays, there are better datasets than DBpedia that offer more accurate geolocation, as you pointed out with LinkedGeoData. Besides the problem of geographical coordinates, as I mentioned in the previous message, I think it's a problem doing \"reasoning\" on resources belonging to two different datasets, linked through owl: sameAs, and having inconsistent information. Probably reasoning on Linked Data (I mean, not a single dataset) is not yet completely mature." 
"DBPedia Arabic Support" "uHi Aya, I'm afraid not, I don't even know what these entities are. Could you give us an example? I'm guessing something like &arabic;? I know of one such entity in MediaWiki, the right-to-left marker. DBpedia doesn't understand that yet and I don't know how much effort it might be to implement it. Regards, Christopher On Tue, Mar 27, 2012 at 20:07, Aya Zoghby < > wrote: uOn 27 March 2012 19:07, Aya Zoghby < > wrote: Do you mean, is there data available that has been extracted from the Arabic Wikipedia? If so, you can find it here: There are currently no mappings available for Arabic: if you can find some volunteers to help in mapping the templates, I'm sure the team will be glad to help you get started. Also, the code has not been localised to Arabic, which is needed to extract certain kinds of information. I've made an attempt to find the marker for disambiguation pages. Does the attached patch look correct to you? uJimmy, Aya, Actually, Arabic mappings were started just recently. Currently there are four: Your patch looks good to me, I'll apply it soon. I recently stumbled upon a page listing all disambig templates used on a wikipedia. It's name is MediaWiki:Disambiguationspage and it's active on most wiki, for example Arabic: Maybe you could extend your patch to include these? Cheers, Christopher On Wed, Mar 28, 2012 at 17:19, Jimmy O'Regan < > wrote: uOn 28 March 2012 18:21, Jona Christopher Sahnwaldt < > wrote: I've done it, but it's a separate patch - I don't use Mercurial much, and have no idea how to flatten a changeset. I've also updated Polish and Irish." "Remaining problem with URI access?" "uHello again Clearly the 4096-byte-truncation problem has been fixed. Many thanks for that. But there's another problem, which I think I saw a few times last week and which now seems fairly persistent, on some subset of the pages. For example, when I try to access or I get the following error from Apache (apologies for any messy line-wrapping): HttpException: HttpException: 500 SPARQL Request Failed: rethrew: HttpException: 500 SPARQL Request Failed com.hp.hpl.jena.sparql.engine.http.HttpQuery.execCommon(HttpQuery.java:273) com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:167) com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:128) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:101) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:95) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:93) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.execDescribeQuery(RemoteSPARQLDataSource.java:68) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.getResourceDescription(RemoteSPARQLDataSource.java:51) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.getResourceDescription(BaseServlet.java:66) de.fuberlin.wiwiss.pubby.servlets.PageURLServlet.doGet(PageURLServlet.java:31) de.fuberlin.wiwiss.pubby.servlets.BaseURLServlet.doGet(BaseURLServlet.java:33) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.doGet(BaseServlet.java:94) javax.servlet.http.HttpServlet.service(HttpServlet.java:689) javax.servlet.http.HttpServlet.service(HttpServlet.java:802) I have no insight into which pages are subject to this error; I've just found a few by chance. If the server is behaving the same for everyone, please bear in mind that the Tetris page is one of the examples offered at Bye for now Aran uAran Lunzer wrote:" "Index of terms" "uHi! 
Is there an index of dbpedia terms in Swedish and in English somewhere? I work in a project where we would like to link our Swedish keywords to corresponding dbpedia terms with explanations. But we can't find any way to get an overview of all the dbpedia terms. Does anyone know how to see them all in alphabetical order or by topic? Regards, Alma Taawo uWell, there's . It contains triples like . I don't know if that's what you're looking for. Regards, Christopher On 7 August 2013 19:36, Alma Taawo < > wrote:" "what is void:statItem in VoID description of DBpedia?" "uHello! I found this resource in the CKAN description of DBpedia. It has the property that I need. Though I cannot find this property in the VoID vocabulary description. Is that a mistake? Sincerely yours, uNo mistake. Just deprecated. See [1] for more information. Cheers, Michael [1] #statistics" "A detailed analysis of the ArchitecturalStructure portion of the DBpedia ontology, proposals included" "uWikipedia defines building as "a man-made structure with a roof and walls standing more or less permanently in one place" and "intended for human use or occupation" [ suggests examining examples of non-building structures, where the Eiffel Tower, the Golden Gate Bridge, roller coasters, water towers, gates, fortifications, and transmission towers are listed. However, DBpedia includes RollerCoaster, WaterRide, Gate, MilitaryStructure, Treadmill, and WaterTower as subclasses of Building. Each of these is either explicitly listed as a non-building structure or is obviously a non-building structure given the exclusions in Wikipedia.
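The class-hierarchy claims in the analysis above can be checked directly against the ontology, assuming it is loaded alongside the data as on the public endpoint; a minimal sketch listing the declared subclasses of dbo:Building (use the property path rdfs:subClassOf+ to include indirect subclasses as well):

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls WHERE {
  ?cls rdfs:subClassOf dbo:Building .
}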
It turns out that many of the classes that are subclasses of Building in the DBpedia ontology are either disjoint from Building or are not truely subclasses of Building. Similar problems affect ArchitecturalStructure. The definition of architectural structure is unclear, but neither public transit systems nor parks are architectural structures at all, except possibly in very unusual cases, even though these are subclasses of ArchitecturalStructure in the DBpedia ontology. Here is my in-depth analysis of ArchitecturalStructure and its subclasses, building on the observations above. I have used the appropriate pages from English Wikipedia and instances from DBpedia 3.9 to back up my arguments. ([NC] ([NEC]) means no (English) comment on the class.) ArchitecturalStructure [NC] Building is a subclass of Architectural Structure. However, the Infobox building is too broadly used in English Wikipedia, e.g., mapping for Infobox building to check for special cases, such as building_type including \"tower\", and map them into the appropriate classes. This is the most complex proposal here because of the overuse of the building infobox. [NC] AmusementParkAttraction is excluded implicitly from building. There are no instances of this class that I could find. Proposal: Move the class to be a direct subclass of ArchitecturalStructure. RollerCoaster is excluded explicitly from building. No further change is needed for this class. [NC] WaterRide is excluded implicitly from building. No further change is needed for this class. [NC] Arena is excluded explicitly from building, presumably because of open-air arenas such as populated directly from the nl:Infobox stadion. Proposal: Move the class to be a direct subclass of ArchitecturalStructure and make no changes to the mapping rule. [NC] Gate is excluded implicitly from building. Proposal: Move the class to be a direct subclass of ArchitecturalStructure. HistoricBuilding needs no change. [NC] MilitaryStructure is not a subclass of building, as it includes fortifications. The class is also used for military bases, which are not even architectural structures, e.g., incorrectly use the Infobox military structure. Proposals: Move the class to be a direct subclass of ArchitecturalStructure. Fix mappings for this infobox to check for type Military base, and then instead use type Place. The comment on the class needs to be changed to conform with architectural structure. Mill is not a subclass of building as many mills have no enclosure and are thus are not buidings. The only mills listed are in the Dutch Wikipedia, and are mostly buildings or at least building-like. One exception is http://nl.dbpedia.org/resource/De_Meent_(Langerak). Proposal: Make this class a direct subclass of ArchitecturalStructure. Even though most mills currently in DBpedia could be considered to be buildings, there will be very little lost here. [The comment needs to be changed to broaden to its actual use.] Treadmill, Watermill, WindMotor, and Windmill should remain as subclasses of Mill and thus need no changes besides being removed from Building. ReligiousBuilding the physical presence of many religious congregations are only parts of buildings, but it appears that in DBpedia only includes those meeting places for religious congregations that are buildings. 
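Spot-checking instances the way the analysis above does can be scripted with a query that lists members of a class next to their Wikipedia pages, e.g. for MilitaryStructure (a sketch; any of the classes under discussion can be substituted):

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?x ?page WHERE {
  ?x a dbo:MilitaryStructure ;
     foaf:isPrimaryTopicOf ?page .
} LIMIT 50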
Monastery is not a subclass of building because many monasteries include multiple buildings and other structures, and are thus not even architectural structures, e.g., http://fr.dbpedia.org/resource/Gndevank and http://nl.dbpdia.org/resource/Makarjevklooster. Instances of this class are populated from Infobox Édifice religieux, checking for type Monastère. Dutch instances of this class are populated directly from nl:Infobox klooster. Proposal: Make this a direct subclass of Place. Some information will be lost for monasteries that are single buildings, but this is a minor loss. Abbey is not a subclass of building because some abbeys include multiple buildings and other structures, and are thus not even architectural structures, e.g., http://fr.dbpedia.org/resource/Abbaye_de_Lucedio Proposal: Make this a direct subclass of Place. Some information will be lost for abbeys that are single buildings, but this is a minor loss. Church is not a subclass of Building. \"A Church is a religious institution, place of worship, or group of worshipers, usually Christian\" http://en.wikipedia.org/wiki/Church. However, all (or almost all) churches in DBpedia are religious buildings, mostly free-standing, as only important Churches have been placed in DBpedia. Proposal: Do not make any change, except putting a comment in the DBpedia ontology that this class is for church buildings only. It would be better to rename this class as something like \"Church_(building)\", but I do not know if this change is worth it. [NC] Mosque is in the same situation as Church. Synagogue is in the same situation as Church. Temple is only populated in Japanese DBpedia, and is likely in the same situation as Church. [NC] Shrine is not a subclass of Building as almost all shrines are not buildings, being simple and small. Proposal: Make this a direct subclass of ArchitecturalStructure. Skyscraper needs no change. [NC] Tower is not a subclass of Building, as many towers have no interior space. Proposal: Move the class to be a direct subclass of ArchitecturalStructure. [The comment needs to be fixed.] Lighthouse is not a subclass of Building, even though many lighthouses are buildings or building-like. \"A lighthouse is a tower, building, or other type of structure\" [http://en.wikipedia.org/wiki/Lighthouse]. Proposal: Leave as as direct subclass of Tower, which will move it out of Building. [NC] WaterTower is excluded explicitly from building. Proposal: Leave as as direct subclass of Tower, which will move it out of Building. Venue is not a subclass of Building as many venues are open air, e.g., http://dbpedia.org/resource/Hollywood_Bowl, or parts of buildings. Venue is populated directly from Infobox venue. Proposal: Move the class to be a direct subclass of ArchitecturalStructure. [NC] Theatre is not a subclass of Building as some theatres are open air, e.g., http://dbpedia.org/resource/Delacorte_Theater, and other theatres are parts of larger buildings or structures e.g., http://dbpedia.org/resource/Circle_in_the_Square_Theatre. Proposal: Leave as a direct subclass of Venue, which will move it out of Building. [NC] Casino, Castle, Factory, Hospital, Hotel [NC], Museum [NC], Prison [NC], Restaurant [NC], ShoppingMall [NC] are all functional facilities, which are often either complexes of buildings and other structures, e.g. 
http://dbpedia.org/resource/Toronto_General_Hospital, or parts of buildings or other structures, e.g., http://dbpedia.org/resource/Casino_Lisboa,_MacauRobuchon_á_Galera1 Proposal: Move all these to be direct subclasses of Place. Dam needs no change. [The comment needs to be adjusted.] Garden is not a subclass of ArchitecturalStructure as many gardens are just plantings of flowers with no associated structure. Proposal: Move this class to be a direct subclass of Place. [NC] Infrastructure needs no change [NC] Airport is questionable as a subclass of ArchitecturalStructure, but probably needs no change. [NC] LaunchPad [NC], Lock [NC], PowerStation [NC], and NuclearPowerStation [NC] need no change. RouteOfTransportation [NC] is questionable as a subclass of ArchitecturalStructure, but probably needs no change. Bridge needs no change. [NC] PublicTransitSystem is not a subclass of RouteOfTransportation or even ArchitecturalStructure, e.g., http://dbpedia.org/resource/Massachusetts_Bay_Transportation_Authority. Proposal: Make this class a direct subclass of Organisation. [NEC] RailwayLine, Road [NC], and RoadJunction [NC] need no change. [The commment on RailwayLine probably needs to be adjusted.] Tunnel is not a subclass of RouteOfTransportation as not all tunnels are routes of transportation, e.g., http://dbpedia.org/resource/Tunnel_of_Eupalinos. Proposal: Make this class a direct subclass of ArchitecturalStructure. RailwayTunnel [NC], RoadTunnel [NC], and WaterwayTunnel [NC] are all subclasses of RouteOfTransportation. Proposal: Make these classes subclasses of both Tunnel and RouteOfTransportation. Station, MetroStation [NEC], and RailwayStation [NC] need no change. NoteworthyPartOfBuilding needs no change. Park is not a subclass of ArchitecturalStructure as many parks are mostly natural areas, e.g., http://dbpedia.org/resource/Deception_Pass. Proposal: Make this class a direct subclass of Place. [NC]" "Generate nerd_stats_output.tsv from nerd-stats.pig" "uGood afternoon, I'm installing a mirror of the web service of the DBPedia, so i'm trying to generate nerd_stats_output.tsv from nerd-stats.pig . I've installed pig and when i run pig -x local nerd-stats.pig it gives me na error of the parameter LANG inside the script. The problem is that i dont know what the parameter is. Can you help me? It's urgent. Thank you. Good afternoon, I'm installing a mirror of the web service of the DBPedia, so i'm trying to generate nerd_stats_output.tsv from nerd-stats.pig . I've installed pig and when i run pig -x local nerd-stats.pig it gives me na error of the parameter LANG inside the script. The problem is that i dont know what the parameter is. Can you help me? It's urgent. Thank you." "Two questions on Spotlight" "uDear all, I have a couple of questions on DBpedia Spotlight. - What means \"n-best candidates\"? I tried to check it, thinking to get as an answer a list of entities for each span, but I obtain only one (with a confidence-like score). - What is the \"confidence\"? I thought that the value shown with the \"n-best candidates\" was the confidence, but I obtain this weird result, as follows. 1. - Leave the text of the demo. - Leave the default confidence (0.5). - Check \"n-best candidates\". - Annotate You will see that the first word, \"First\" is not linked. 2. - Leave the text of the demo. - Set the confidence to 0.1. - Check \"n-best candidates\". - Annotate You will see that the first word, \"First\", now is linked to WWI with confidence 1. Is it normal? 
Best, Alessio Hi Alessio, On Wed, May 27, 2015 at 1:05 PM, Alessio Palmero Aprosio < > wrote: It is the list of candidate entities that will be disambiguated. The linking takes place in two stages: spotting and disambiguating. Spotting matches text to surface forms; disambiguating chooses one topic among the potential candidates (the n-best candidates). As it currently stands, the confidence parameter refers both to the spotter and the disambiguator, so that parameter is used to prune potential spots as well as potential topics. My guess is that by lowering the confidence you allowed the spotter to get an extra surface form match ("First")."
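To see the effect of the confidence parameter outside the demo page, the web service can be called directly. The snippet below is a sketch, not something from the thread: the host and the /rest/annotate path, and the text/confidence parameter names, follow the commonly documented Spotlight REST interface and may differ for a given deployment; the example text only resembles the demo text.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class SpotlightConfidenceDemo {
    public static void main(String[] args) throws Exception {
        String text = "First documented in the 13th century, Berlin was the capital of Prussia.";
        // A lower confidence lets more (and riskier) spots survive both the
        // spotter and the disambiguator, as explained in the reply above.
        for (String confidence : new String[] {"0.5", "0.1"}) {
            String url = "http://spotlight.dbpedia.org/rest/annotate"
                       + "?text=" + URLEncoder.encode(text, "UTF-8")
                       + "&confidence=" + confidence;
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestProperty("Accept", "application/json");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"));
            StringBuilder body = new StringBuilder();
            for (String line; (line = in.readLine()) != null; ) {
                body.append(line);
            }
            in.close();
            System.out.println("confidence=" + confidence + ": " + body.length() + " bytes of JSON");
        }
    }
}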
"Metadata for publicdomainworks.net" "Hi all, I write to you on behalf of the Public Domain Working Group of the Open Knowledge Foundation. We are currently working on enhancing and expanding the collection of works available at www.publicdomainworks.net - a registry of artistic works that are in the public domain. In order to do that, we need to gain access to a maximum of information concerning works of different kinds (in particular, the author, the date of birth and death, the date of publication, etc). While we already have a proper database for bibliographic works, we are interested in getting metadata for as many different kinds of works as possible. We are currently trying to figure out where we can find metadata about works other than books or other written publications. In particular, I would like to know whether you already have a collection of metadata for different kinds of works (images, photographs, paintings, sound recordings, video recordings, etc) and/or who you would suggest we contact in order to obtain the metadata we are looking for. Looking forward to your answer, Primavera" "ask help from a dbpedia programmer" "hello: I got error messages as follows: HttpException: HttpException: 500 SPARQL Request Failed: HttpException: 500 SPARQL Request Failed com.hp.hpl.jena.sparql.engine.http.HttpQuery.execCommon(HttpQuery.java:340) com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:190) com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:147) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:131) eMSE.DBPedia.searchOnDBPedia(DBPedia.java:66) eMSE.SemanticSearcher.doGet(SemanticSearcher.java:44) javax.servlet.http.HttpServlet.service(HttpServlet.java:689) javax.servlet.http.HttpServlet.service(HttpServlet.java:802) when executing the following SPARQL query through "QueryExecutionFactory.sparqlService(": PREFIX : PREFIX rdfs: PREFIX foaf: PREFIX dbpedia: PREFIX owl: PREFIX rdf: SELECT DISTINCT ?label ?abstract ?ref ?type WHERE { ?x rdfs:comment ?abstract ; rdf:type ?type ; ?ref ; rdfs:label ?label . FILTER regex(?label, "semantic", "i") } ORDER BY ?label What is the problem? How do I solve it? many thanks! hello: I got The source code is as follows: PREFIX : PREFIX rdfs: PREFIX foaf: PREFIX dbpedia: PREFIX owl: PREFIX rdf: SELECT DISTINCT ?label ?abstract ?ref ?type WHERE { ?x rdfs:comment ?abstract ; rdf:type ?type ; ?ref ; rdfs:label ?label . FILTER regex(?label, "semantic", "i") } ORDER BY ?label many thanks!"
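For reference, a pared-down version of this query usually completes against the public endpoint. The sketch below is an illustrative addition (assuming Jena 3.x): it restricts the regex to English labels, drops the comment/type/ref joins, and adds a LIMIT so the request stays within the endpoint's protections; a regex over every label is still expensive and can time out.

import org.apache.jena.query.*;

public class LabelSearch {
    public static void main(String[] args) {
        // Narrower variant of the failing query above.
        String q =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "SELECT DISTINCT ?x ?label WHERE { " +
            "  ?x rdfs:label ?label . " +
            "  FILTER ( lang(?label) = 'en' && regex(?label, 'semantic', 'i') ) " +
            "} LIMIT 100";
        QueryExecution qe = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", q);
        try {
            ResultSet rs = qe.execSelect();
            while (rs.hasNext()) {
                QuerySolution row = rs.next();
                System.out.println(row.getResource("x") + "  " + row.getLiteral("label"));
            }
        } finally {
            qe.close();
        }
    }
}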
"Issue with table mappings" "Hi, I have been trying to run the DBpedia Extraction Framework (DEF) on the English Wikipedia dump. Unfortunately the DEF page extraction threads kept crashing with memory issues. After various experiments and help from the DEF mailing list, I was able to narrow it down to a few particular pages and the MappingExtractor. I was pointed to look at the table mappings. It turns out that when we changed the code to ignore the table mappings coming from dbpedia.org, the code runs fine. There are currently only a couple of table mappings present @dbpedia. My question is whether anyone has an idea about this problem. Is the problem with the DBpedia code or with the mappings? Also, how important are these table mappings? Given the memory problems and the small number of mappings, we are thinking of going ahead without the table mappings for my work. What effect will this have on the DEF output? Thanks in advance Regards Amit Kumar Tech Lead @ Yahoo! Sent from Samsung Mobile Hi, I think the only effect is that you will miss some data on automobiles, as there is but the rest will be there. Minor note: You mentioned that there are a couple of mappings, but I only see one: Do you have a hint on which article it crashes? In case you need that data and someone fixes the extractor at some point (unfortunately I won't have time for that), it would also be possible to run it stand alone in order to generate the missing data at a later point (saves a bit of time compared to doing a full dump). Cheers, Claus On 01/16/2012 10:55 AM, Amit Kumar wrote: Wikipedia |- ! Engine !! Horsepower !! Torque (lb.-ft.) !! Redline !! Displacement (L / cu in) !! Fuel !! Bores !! Stroke !! Compression Ratio !!
MPG City / Hwy / Combined |- | LS7|| 505 @ 6200 || 475 @ 4800 || 7000 || 7.0 || 91|| 4.125' || 4.00' || 11.0:1 |- | LS2|| 400 @ 6000|| 400 @ 4400|| || 6.0 || 93 || 4.00'|| 3.62'|| 10.9:1 || 19/28/23 |- | LS9 || 638 @ 6500|| 604 @ 3800|| Redline || 6.2 || Fuel || Bores || Stroke || Compression || MPG |- | LS1|| Horsepower || Torque || Redline || Displacement || Fuel || Bores || Stroke || Compression || MPG |- | LS6|| Horsepower || Torque || Redline || Displacement || Fuel || Bores || Stroke || Compression || MPG |- | LQ9 || Horsepower || Torque || Redline || Displacement || Fuel || Bores || Stroke || Compression || MPG |- | L33 || Horsepower || Torque || Redline || Displacement || Fuel || Bores || Stroke || Compression || MPG |- | LS3|| 426 @ 5900|| 420 @ 4600|| 6600 || 6.2 || Fuel || 4.06'|| 3.62'|| 10.7:1 || 16/24/ |- | LS99|| 400 @ 5900 || 410 @ 4300|| 6200 || 6.2 || Fuel || 4.06'|| 3.62' || 10.4:1 || 16/25/ |- | LSA || 550 @ 6100 (est)|| 550 @ 3800 (est)|| 6200 || 6.2 || Fuel || 4.06'|| 3.62' || 9.1:1 || MPG |- | LSX376|| 450 @ 5900|| 444 @ 4600|| 6600 || 6.2 || 92 || 4.06' || 3.62' || 9:1|| MPG |- | LSX454|| 620 @ 6200|| 590 @ 4800|| 6500|| 7.4 || 92 || 4.185 || 4.125|| 11.0:1|| MPG |- | LSX454R|| 720 @ 6000|| 720 @ 4500|| 6500|| 7.4 || 92 || 4.185 || 4.125|| 13.1:1|| MPG |- | Engine || Horsepower || Torque || Redline || Displacement || Fuel || Bores || Stroke || Compression || MPG |} ==See also== *[[Buick V8 engine]] *[[Cadillac V8 engine]] *[[Oldsmobile V8 engine]] *[[Pontiac V8 engine]] *[[Chevrolet Big-Block engine]] *[[GM LT engine]] – Generation II small block *[[GM LS engine]] – Generation III/IV small block *[[Chevrolet 90-Degree V6 engine]] - a V6 version of the original small-block removing cylinders 3 and 6, still in production *[[List of GM engines]] ==References== *{{cite web|url= : [[Category:Chevrolet engines|Small-Block]] [[fr:Chevrolet small-block engine]] [[it:Chevrolet Small-Block]] [[sv:Chevrolet Small block]] Wikipedia wrote: uI am not using Virtuoso, so I cant use bif-contains. The intent is to get a list of all ?x where label has the substring \"engine\" So I would expect the result to be something like: label x Jet engine id1 Steam engine id2 Rocket engine id3 Search engine id4 I have used the following type of SPARQL on other endpoints That don't use Virtuoso and it works fine. Why doesn't this work for dbpedia???? PREFIX rdf: PREFIX rdfs: SELECT ?label ?x WHERE { ?x rdfs:label ?label. FILTER regex(?label, \"(?i)engine\"@en). } Thanks, John uOn 16 Dec 2009, at 00:43, John Abjanic wrote: I haven't seen this (?i) syntax before. I would have tried something like regex(?label, \".*engine\", \"i\") .* matches any sequence of characters. The third parameter should make the match case insensitive. I don't think that the @en is going to work here, you'll have to use a separate FILTER with lang(?label)=\"en\" or something like that. I haven't tried anything of the above, so my apologies if it's rubbish :-/ Richard uHi Richard, I tried what you suggested and it did not work. I get a: HttpException: HttpException: 500 SPARQL Request Failed: rethrew: HttpException: 500 SPARQL Request Failed I have tried this type of query on 2 other web site endpoints and it works quite well. PREFIX rdf: uJohn, it's not my site, I'm just subscribed to the mailing list and sometimes respond to questions if I happen to know the answer. 
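Pulling the suggestions in this thread together: on dbpedia.org, which runs Virtuoso, a full-text predicate is usually far cheaper than a case-insensitive regex over every label. The sketch below is an illustrative addition, not something posted in the thread: bif:contains is a Virtuoso-specific extension, and QueryEngineHTTP is used so that Jena ships the query string to the endpoint without trying to parse the non-standard syntax locally (assuming a Jena 3.x version that still provides this class). The portable alternative is the regex(?label, "engine", "i") form with a separate lang(?label) filter, as suggested above.

import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.sparql.engine.http.QueryEngineHTTP;

public class EngineLabelSearch {
    public static void main(String[] args) {
        // Virtuoso-specific full-text match on labels containing "engine".
        String q =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "SELECT DISTINCT ?x ?label WHERE { " +
            "  ?x rdfs:label ?label . " +
            "  ?label bif:contains \"engine\" . " +
            "  FILTER ( lang(?label) = 'en' ) " +
            "} LIMIT 100";
        // QueryEngineHTTP sends the raw query string, so the bif: extension is
        // only interpreted by the server, not by Jena's parser.
        QueryEngineHTTP qexec = new QueryEngineHTTP("http://dbpedia.org/sparql", q);
        try {
            ResultSet rs = qexec.execSelect();
            ResultSetFormatter.out(System.out, rs);
        } finally {
            qexec.close();
        }
    }
}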
The suggestion that someone else posted before" "Inconsistencies in dbpedia extracted data / Group of templates" "uHi, I have questions regarding group of infoboxes/templates which are designed to be used together to provide additional data for a resource. Specifically, I have been checking the Infobox \"Template:Starbox begin\" [1] (and the other templates in the same \"group\"). This template should be used in conjuction with other templates to add properties to articles which refer to stars. If I check which pages link to that template I see that there are hundreds [2], while in the mappings page it seems to occur only in one article [3] [3b] One of the articles which actually uses that template in Wikipedia is Algol (dbpedia: [4], wikipedia: [5]). As you can see it used the \"Template:Starbox begin\" when the dbpedia extraction was run (from what I can see in the old revision). But on dbpedia, Algol is not using \"Template:Starbox begin\", moreover it has a dbpedia-owl:Writer rdf:type since there is a Infobox_writer in the article. My questions are: 1) are the \"Template:Starbox begin\" occurrences correct in the mappings statistics page? 2) what is the approach used by the dbpedia extraction framework in case different infoboxes are present in a wiki article? 3) how to deal with group of templates which provide properties for the same resource? Thanks Andrea [1] [2] [3] [3b] [4] [5] Hi, I have questions regarding group of infoboxes/templates which are designed to be used together to provide additional data for a resource. Specifically, I have been checking the Infobox 'Template:Starbox begin' [1] (and the other templates in the same 'group'). This template should be used in conjuction with other templates to add properties to articles which refer to stars. If I check which pages link to that template I see that there are hundreds [2], while in the mappings page it seems to occur only in one article [3] [3b] One of the articles which actually uses that template in Wikipedia is Algol (dbpedia: [4], wikipedia: [5]). As you can see it used the 'Template:Starbox begin' when the dbpedia extraction was run (from what I can see in the old revision). But on dbpedia, Algol is not using 'Template:Starbox begin', moreover it has a dbpedia-owl:Writer rdf:type since there is a Infobox_writer in the article. My questions are: 1) are the 'Template:Starbox begin' occurrences correct in the mappings statistics page? 2) what is the approach used by the dbpedia extraction framework in case different infoboxes are present in a wiki article? 3) how to deal with group of templates which provide properties for the same resource? Thanks Andrea [1] Algol?oldid=495281281 uHi Andrea, On Thu, Jan 10, 2013 at 7:08 PM, Andrea Di Menna < > wrote: There is a minPropertyCount limit in the statistics to reduce general purpose templates and it is set to '2'. Starbox_begin template has only one label property and that is why it is miscalculated. There must be one wrong instance with two properties defined that's why you see only one. This is a known problem but the framework doesn't handle this type of information. 
Whenever there is a mapping the framework expects a class definition and it either assigns it in the article resource or in a new \"intermediate\" resource ( in the case of multiple defined mappings ) To solve this I think we should add a \"noMapToClass\" property in the templateMapping and whenever the framework reads that definition it just adds the mapping output directly to the main resource without rdf:type info Would you like to help in this regard? We could of course help you and provide you with repo access uHi Dimitris, 2013/1/10 Dimitris Kontokostas < > I think the wrong instance was was missing closing brackets in the Starbox_begin template [1]. The typo has been corrected after the release of dbpedia 3.8 [2], in fact in live DBpedia there is no entity known to be using that template. My questions are then: 1) are the statistics calculated on dbpedia data or on wikipedia live data? 2) are the templates with numOfProperties < minPropertyCount only hidden from the templates or also not processed during mappings extraction? Hiding them from the statistics makes sense to me, but if it is possible to create mappings for them I would like to understand how a DBpedia mapping contributor can be informed about the existence of such templates. I personally use the statistics page to decide what to work on first and in case of a missing template in that page, it would be difficult for me to get to know other possible candidates. Moreover, the Starbox_* templates are used in more than 2k articles, which could lead us to at least assign proper types to a big set of entities. 2) what is the approach used by the dbpedia extraction framework in case I agree with this approach and I would love to contribute. I just need a bit of time to get used to the dbpedia extraction framework code and also with Scala :P Also, wouldn't it be possible to avoid creating a new intermediate resource when there are multiple templates which map to the same ontology class? If the resource has been already assigned a class, and another template in the article maps to the same class then you do not create another \"fake\" resource and add ontology properties to the original resource. Otherwise you create a new resource and add properties to that Does that make sense? Thanks Andrea uHi Andrea, On Fri, Jan 11, 2013 at 10:28 PM, Andrea Di Menna < > wrote: The statistics are generated per dbpedia release, so these numbers are from ~June 2012 There are 2 infobox extractors in the framework and they are independent to each other: the infoboxExtractor and the mappingsExtractor. InfoboxExtractor generates triples in the dbprop name space but discards some templates / properties according to [1], [2] to remove probably unwanted triples in line 131 [3] we do a variation of the output to generate the basis for the statistics. If we lower these restrictions the problem will get worse and the statistics table will be filled with a big number of formatting (or whatever) templates unless you have any suggestions on this As I said the mappings extractor is independent so whatever you map in the mappings wiki will be mapped by the framework regardless of the statistics [1] [2] [3] take all the time you need :) This approach looks good and it's probably easier too ;) Best, Dimitris" "SEMANTiCS 2014 - Networking for Funding (till Sept. 3rd, 10 am)" "uDear Colleagues,  SEMANTiCS2014 is a venue, where different communities meet. These communities include business and research, and also different horizontals and verticals. 
Thus, SEMANTiCS has a huge potential in shaping the European research and innovation with regard to applying different technologies (semantic, data, language, and others) for solving real world problems.  We would like to take this opportunity and create a venue at the conference – a session that may have an impact on our fields in the future. The session will sketch directions we are targeting and link parties working on complementary topics or searching for solutions of problems. We will go beyond a typical H2020 matchmaking session. The voice of the community will be heard. As Linked Data technology has matured and has proven its effectiveness in some areas, we – as a community – should discuss what strategies there are to drive the Linked Data economy to a boom in Europe.  In this 60-minute session, we will listen to Linked Data leaders and experts from the community about the potentials and current shortcomings of Linked Data as well as their individual opinion about the direction Horizon 2020 should go to provide effective support for Linked Data adoption by businesses. The session is open for everyone that would like to share his point of view. This may be done, filling in the form: up consortia to apply for funding. We will allow all attendees of SEMANTiCS to submit an organisation profile description, so they can be contacted by potential partners. These descriptions will be made available to all attendees of SEMANTiCS in a printed form to facilitate networking at (and after) the conference. If you are interested, we would kindly ask you to fill out this form: Hellmann" "How to get involved" "uHi, My name is Atul. I have fairly good skills and experience in Java. But, I don't know much about Scala. If not knowing much about Scala is not a big issue for now, I would like to contribute to DBpedia. Since I think I'm good at learning so I'll be learning Scala along. Please guide me on how to get started. I know there are already some threads on this but still it would be of great help to me if you can suggest me some easy bugs matching my capabilities. I'll appreciate any help. Thank You and Regrads, Atul Hi, My name is Atul. I have fairly good skills and experience in Java. But, I don't know much about Scala. If not knowing much about Scala is not a big issue for now, I would like to contribute to DBpedia. Since I think I'm good at learning so I'll be learning Scala along. Please guide me on how to get started. I know there are already some threads on this but still it would be of great help to me if you can suggest me some easy bugs matching my capabilities. I'll appreciate any help. Thank You and Regrads, Atul uHi Atul & welcome to DBpedia, We are currently re-organizing our get-involved pages They need some more work to be complete but should be enough to get you started. Go through those pages and come back to us with more questions :) We use the dev lists for developer related discussions Cheers, Dimitris On Sun, Jan 17, 2016 at 9:42 PM, Atul Sharma < > wrote:" "Wiki Mapping" "uGreetings everyone, I would be thankful if any one could let me refer to some material or article on how to use DBpedia mapping API, beside [1] and [2]. I am interested in changing the mapping template mostly by including some more keywords and making respective changes in source code of the extractor. Currently I am in the process of installing required software and make extraction server work by following [2]. 
Please let me know your comment on, \"would above thinking accomplishes the above mentioned task or I need to make some changes\" ? Thanking you in advance, Regards, Ankur Padia. Reference: uHi Ankur, I don't know any other sources of documentation about the mappings except maybe that you shouldn't use the development info from the main site since it is outdated, use the info from the Github wiki instead [1]. It would also be helpful if you tell us in more detail what you mean by Mapping API and keywords, so we can better help you. Cheers, Alexandru [1] On Apr 18, 2014 10:23 AM, \"Ankur Padia\" < > wrote: uHello Alexandru, Thank you for help. I am trying to solve an issue [1] on extraction framework. I have downloaded the git repo along with maven 3.0 and currently working with Intellij IDEA. However, I am not able to figure which class (of the project) should I pick to start with. Also, with reference to the GSoC 2014 proposal (if goes successful) on Ontology Enrichment through Pattern Extraction, I would be required to modify mapping to reflect new axioms. I found issue [1] to be more relevant. Can you please let me know how to proceed ?. Is there any architectural pipeline or the flow of execution explain the how the information is transferred between packages. Or some thing relevant will be of some help. Thanking you in advance. Regards, Ankur Padia. On Fri, Apr 18, 2014 at 4:14 PM, Alexandru Todor < >wrote:" "Connecting 30k tags to URI's" "uHi All I've got about 30k terms that I wanna get dbpedia, da.wikipedia.org and en.wikipedia.org URI's for. Right now I'm doing this sparql against dbpedia.org/sparql : prefix foaf: prefix vcard: SELECT * WHERE { $uri ?p ?s FILTER ( (?p = rdfs:label && ?s = 'Johannes Wehner'@da) || (?p = rdfs:label && ?s = 'Johannes Wehner'@en) || (?p = foaf:name && ?s = 'Johannes Wehner') || (?p = vcard:FN && ?s = 'Johannes Wehner'@en) ) } I get it in xml-format and i parse it. It sorta works, but it takes forever. What should I do? Should do it locally? Or can I speed it in another way? Besides time a big concern is danish results. I've tried using know if you have any ideas on that as well. Have a nice friday! Best, /Johs. uJohannes Wehner wrote: DBpedia is when all is said an done a database, one that is uniquely open to the entire world to access. Without any sense of self protection the entire endeavor would be impractial. There are deliberate controls in place on the Virtuoso backend that ensure the engine is protected accordingly. Please use LIMIT and OFFSET to build data windows that work within the parameters of the server side protections. Of course you can also install and load a local version of Virtuoso [1] or simply instantiate an Amazon EC2 cloud AMI [2]. Links: 1. 2. VirtAWSPublicDataSets uHello, I'm assuming you are doing 30k SPARQL queries also. So let's say you have a network latency of 100ms, that would be almost an hour just for the network. One more reason to do it locally. The plain datasets can be downloaded at: Or you can use the links given below. Regards, Sebastian Kingsley Idehen schrieb:" "2nd LIDER Hackathon Preparation call: today at 14:00" "uApologies for cross-posting. This is a kind reminder about the weekly Google Hangout to prepare for the LIDER Hackathon in Leipzig (Sept 1st). The preparation Hangouts will happen each Tuesday at 2pm Leipzig time until the event. Links to join can be found here: You are still able to submit topics for hacking. 
Please add them to this document: or send an email to Bettina Klimek < > Currently we have the confirmed topics below. Furthermore we have experts available that will help you to get in touch with Linked Data and RDF and help you to bring your own tools to the Semantic Web world. T7: [Confirmed] Roundtrip conversion from TBX2RDF and back The idea of this is to work on a roundtrip conversion from the TBX standard for representing terminology to RDF and back. The idea would be to build on the existing code at bitbucket: Potential industry partner: TILDE (Tatiana) Source code: TBX Standard: Contact person: Philipp Cimiano, John McCrae, Victor Rodriguez-Doncel T8: [Confirmed] Converting multilingual dictionaries as LD on the Web The experience on the creation of the Apertium RDF dictionaries will be presented. Taking as starting point a bilingual dictionary represented in LMF/XML, a mapping into RDF was made by using tools such as Open Refine . From each bilingual dictionary three components (graphs) were created in RDF: two lexicons and a translation set. The used vocabularies were lemon for representing lexical information and the translation module for representing translations. Once they were published on the Web, some immediate benefits arise such as: automatic enrichment of the monolingual lexicons each time a new dictionary is published (due to the URIs ruse), simple graph-based navigation across the lexical information and, more interestingly, simple querying across (initially) independent dictionaries. The task could be either to reproduce part of the Apertium generation process, for those willing to learn about lemon and about techniques for representing translations in RDF, or to repeat the process with other input data (bilingual or multilingual lexica) provided by participants. Contact person: Jorge Gracia T9: [Confirmed] Based on the NIF-LD output of Babelfy we can try to deploy existing RDF visualizations out of the box and query the output with SPARQL Babelfy is a unified, multilingual, graph-based approach to Entity Linking and Word Sense Disambiguation. Based on a loose identification of candidate meanings, coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations, Babelfy is able to annotate free text with with both concepts and named entities drawn from BabelNet ’s sense inventory. The task consists of converting text annotated by Babelfy into RDF format. In order to accomplish this, participants will start from free text, will annotate it with Babelfy and will eventually make use of the NLP2RDF NIF module . Data can also be displayed using visualization tools such as RelFinder . Contact person: Tiziano Flati ( ), Roberto Navigli ( )" "Extraction Framework (Questions arised from the mapping marathon)" "uHi Mariano, I don't have answers for everything, but here goes my 2c. (split by subject) EXTRACTION It seems that the extraction process reads the properties found in the I think so. All wiki pages have a delete tab, but we do not know if it is an immediate AFAIK, it's immediate. When we create a DBpedia class or property, when it becomes effective?, AFAIK, it's immediate. What do you mean \"life cycle\"? Changes show up in live.dbpedia.org nearly immediate and on dbpedia.org in the next release (usually twice a year for the entire data & as frequent as you want for your localized version. Eg: In the statistics of (es) Ficha_de_futbolista we can find the property What happens if you map only one? 
Maybe the infobox itself is doing some resolution there? When you say you get two triples, do you mean you get one for the mapped and one for the non-mapped property. The parsing of spanish dates (dd/mm/yyyy) does not work (property mapped to You can patch the Date and Decimal extractors to take some i18n config params. There is a current debate about this in the i18n committee. The current solution is to generate the triples under bypass this step at least in cases where we're more confident that the link is true (for example with bidirectional language links). Feel free to join the discussion: is there any scheduling for the next dump? We are anxious about knowing how Generalized dumps for the entire (Internationalized) DBpedia usually happen twice a year. The international chapters are free to release their data in any release cycle they see fit. So you may just run the extraction framework on your side and tell us how many triples you get. We are also curious! :) Whenever the machine is set up, please e-mail dbpedia-developers with the IP and the responsible party will set up the domain forwarding. Folks, anybody else can chip in? Cheers, Pablo uHi, EXTRACTION This is correct. The infobox definition is *not* taken into consideration Eg: In the statistics of (es) Ficha_de_futbolista we can find the property I am also confused here. Pablo is correct on the distinction (ontology - property). If this is not the case, you should map them both to the same DBpedia ontology property. If the infobox has both of them defined then you will get 2 triples, otherwise you will get only one. At least you should :-) You could take a look at [1]. I updated it recently. We could also create an I18n FAQ page for similar questions. Maybe the Spanish guys can gather all their questions in an page (i.e. [2]) and we could help them write the answers :-) Cheers, Dimitris [1] [2] Hi, EXTRACTION It seems that the extraction process reads the properties found in the infobox instances, without checking if those properties are in the infobox definition . is that so? I think so. This is correct. The infobox definition is not taken into consideration Eg: In the statistics of (es) Ficha_de_futbolista we can find the property 'altura' as one of the most used, but that property is not in the infobox definition. In the infobox definition we can see 'estatura' (a concept similar to 'altura') but is much less used that 'altura'. Do we have a mechanism to map both infobox properties to the same DBpedia property? We tried creating two mappings, one for 'altura' and another for 'estatura', but we get always two triples for each infobox instance (although the instance has only one of these properties). Any solution? What happens if you map only one? Maybe the infobox itself is doing some resolution there? When you say you get two triples, do you mean you get one for the faq uDimitris, Great idea! A sort of \"one act of kindness generates another,\" or \"pay it forward.\" Mariano, Oscar, can it be done? Cheers, Pablo On Tue, Nov 8, 2011 at 7:16 PM, Dimitris Kontokostas < >wrote: uSure!! It is done!! At last it is there. It is a simple cut&paste; of our conversations, and it can be improved in many aspects (more categories, or links to a detailed pages with examples), but it is a first step. I promise to clean it up and to add links to a \"Best practices for mapping\" page. Feel free to add/modify what you consider appropriate. 
Best regards," "Ontology maintenance" "uHi, I am adding Dutch labels to the ontology and in the process I'm changing the old notations to the new coding style. While working my way through a lot of properties i run into stuff like this: {{ObjectProperty | labels = {{label|en|person function}} {{label|nl|persoon functie}} | rdfs:range = PersonFunction }} In my opinion it would be more complete/precise if the rdfs:domain is declared (Person), because obviously this property is intended to connect Person to PersonFunction. I am tempted to add it to improve the ontology. What do you think; how to proceed in a situation like this? Thanks, Roland uHi Roland, On 11/20/2012 12:23 AM, Roland Cornelissen wrote: I agree with you. You can go ahead and improve it. uHi Mohamed, And what about this one: {{DatatypeProperty |labels = {{label|de|Fläche}} {{label|en|area total}} {{label|nl|oppervlakte}} {{label|el|έκταση_περιοχής}} {{label|fr|superficie}} |rdfs:domain = Place |rdfs:range = Area }} To me it appears strange to express the range of this property as Area. I think the area total is expressed in some kind of metrics, like for example 128,37 km². Is this an error or am i missing something here? Thanks, Roland On 11/21/2012 03:17 PM, Mohamed Morsey wrote: uHi Roland, On 11/21/2012 08:12 PM, Roland Cornelissen wrote: Property \"areaTotal\" uses the SI units, so it should be expressed in square meter. Please have a look on that thread [1], as it was discussing that property in details. [1] msg03623.html uThat's interesting, apparently there are 2 different notations of the property areaTotal. The one in the mappingserver [1] and live [2] are different from the 'main' one [3] which is the correct one afaik. In [2] there is no rdf:type recognised although it's declared. Not sure what's going on here? Thanks, Roland [1] [2] [3] On 11/21/2012 08:45 PM, Mohamed Morsey wrote: uHi Roland, On 11/21/2012 09:31 PM, Roland Cornelissen wrote: the ontology loaded to DBpedia-Live gets an update feed from the mappings wiki, so it should contain the most recent version of the ontology. problem fixed. [1] dbpedia_3.8.owl.bz2 uOn Wed, Nov 21, 2012 at 9:31 PM, Roland Cornelissen < > wrote: [3] shows the ontology as it is in the OWL file, while [1] and [2] show it as it is in the mappings wiki. When we generate the OWL file, we deliberately switch the range from dbpedia-owl:Area to xsd:double. I don't remember whyAn update by Robert on Mar 31 2010 introduced that behavior: uHi, There are two distinct ObjectProperties mentioned in the DBpedia ontology that (imho) declare the same thing, those are: {{ObjectProperty | labels = {{label|en|author}} {{label|nl|auteur}} | rdfs:domain = Work | rdfs:range = Person | owl:equivalentProperty = schema:author }} and {{ObjectProperty | rdfs: = writer | rdfs: = ????????????? | rdfs:domain = Work | rdfs:range = Person }} In a mapping [1] where both properties are found, they are mapped each to the same infobox-field. Which proves they are the same. Not sure if that always holds, but I would say that these properties need to be merged somehow. What would be the correct or best way to handle this? Any advice would be appreciated. Thanks Roland [1] On 11/20/2012 12:23 AM, Roland Cornelissen wrote: uHi Roland, Thanks for looking into this. We do need more people doing quality checks. I wouldn't say that the overlap in infobox fields proves that they are equivalent. It only shows evidence that they are related. In this particular case, isn't writer a subproperty of author? 
Authors of songs could be writers, composers, etc.? But authors of (textual) books are writers? I do not know the best way to handle this (Jona?), because I imagine that before merging two properties in the ontology we need to make sure that all infobox fields in all languages will still make sense after the merge. Perhaps we need to flag every affected mapping (via discussion page?) & alert the chapters to the change, which then need to give an ok at the discussion page? Any better ideas? Also, in this particular case it may be safer to \"merge up\" from the more specific to the more generic, at the cost of losing information (specificity). Cheers Pablo On Nov 30, 2012 9:35 PM, \"Roland Cornelissen\" < > wrote: uHi, I have a concern also with the relation WorldChampion : which domain is SnookerChamp and range xsd:gYear !! I believe the right thing to do here would be to introduce an intermediate node of some class \"CompetitionTitle\" with a property date. Is there some good practice for intermediate nodes ? - Should we use the same types that for resources that are subject of wikipedia articles ? - Should we also keep direct properties (without intermediate node), for instance rename the \"worldChampion\" relation here in \"worldChampionIn\" ? - Is there a way to give the reverse property to \"correspondingProperty\" in the IntermediateNodeMapping ? For instance as we have now dbpedia:Joe_Davis dbpedia-owl:worldChampion \"1927\" With the IntermediateNodeMapping we can get : dbpedia:Joe_Davis dbpedia-owl:title dbpedia:Joe_Davis_1 dbpedia:Joe_Davis_1 rdf:type dbpedia-owl:CompetitionTitle How can we also generate : dbpedia:Joe_Davis_1 dbpedia-owl:person dbpedia:Joe_Davis Cheers, Julien uHi Julien, I created a similar example some time ago for this case with officeholder to distinguish time periods (look at the end of I see your point here but this information is already extracted witth the reverse property in the first case isn't the second triple reduntant? BTW, I dont know if we can extract a triple this wayThe ConstantMapping could be an option but I don't think it's supported as-is Best, Dimitris On Mon, Dec 10, 2012 at 12:09 PM, Julien Cojan < >wrote: uHi Dimitris , uHi Julien, On Tue, Dec 11, 2012 at 5:55 PM, Julien Cojan < > wrote:" "An offence to the Queens" "uHi all I like stumbling on funny bugs in DBpedia classification, like this one yago:Queen102313008 rdfs:subClassOf yago:Insect102159955 Check the instances of the Queen class, well many seem somewhat rather human-like - although there are some elves as well in the list. I say, this is *shocking* indeed. No more 'Sir-ification' to hope for the Semantic Web community I'm afraid, if nothing is done quickly about it. :-) Bernard uHi Bernard, Well, if humans are typed has Queen102313008; it is indeed shocking :) Some backgrounds of the issue here: This queen refers to this WordNet word sense: {02313008} [05] S: (n) *queen#1* (the only fertile female in a colony of social insects such as bees and ants and termites; its function is to lay eggs) That is an insect. The problem is not with Yago, but potentially with its integration into DBPedia (I didn't investigated further than by reading your mail). What one have to take into account when using Yago is that these concepts (as Queen) come from WordNet, so one concept has, in most of the cases, more than one word sense. Word senses are tagged with the WordNet word sense number (0231). Take care, Fred uHi Fred Frederick Giasson a écrit : It is! 
definitely human, if you don't mind the pointed ears; she looks more human than many creatures I've met in my life which called themselves so. :-) Indeed. Certainly. Typical vocabulary clash. > Understood. I'm sure the DBpedia team is aware of those homonymy issues and will improve the extraction to handle them more correctly in future versions. Bernard" "Ampersand in dbpedia returned URI breaking Jena code" "Hi, The following SPARQL query: select distinct ?Concept where {[] a ?Concept is the default query at the dbpedia endpoint It returns several URIs including the following one (notice the "and" sign): So DBpedia is returning URIs containing an ampersand. This is causing an exception in the Jena parser. How do I fix this? None of Jena's methods will work; I can't transform the result set into a model or even print it with the result formatter. If I iterate over it, I can print the results one by one till I get to the malformed URI. How do I check in my code for malformed URIs? Any ideas? Thanks! Marv" "Dealing with UTF8 IRIs in HTTP Sparql Queries" "Hi all, I am using the Jena Java framework for querying the DBpedia endpoint using SPARQL, to get the type for all points of interest in German cities.
I am facing no issue for places that have English DBpedia entries. But, when it comes to place names with utf encoding (German entry - umlaut, sharp S), this query returns no result. This seems to be an issue with the German DBpedia end point, as mentioned over here ( Even after referring to this, I am unable to solve the problem. I don't know how to work with QueryEngineHTTP. I am adding two code snippets - one that works (first one - query for Allianz Arena : which has an English entry in DBpedia) and one that doesn't work (second one - for Schloß Nymphenburg, that has a German entry). This might be a very trivial issue, but I am unable to solve it. Any pointers to a solution would be very very helpful. Thanks a lot! Code 1 - working : String service = \" ParameterizedSparqlString query = new ParameterizedSparqlString( \"PREFIX geo: \" + \"PREFIX dbo: \" + \"PREFIX dcterms: \" + \"SELECT * WHERE {\" + \"?s geo:lat ?lat .\" + \"?s geo:long ?long .\" + \"?s dcterms:subject ?sub}\"); query.setIri(\"?s\", \" QueryExecutionFactory.sparqlService(service, query.toString());ResultSet results = qe.execSelect();ResultSetFormatter.out(System.out, results); Code 2 - not working : String service = \" ParameterizedSparqlString query = new ParameterizedSparqlString( \"PREFIX geo: \" + \"PREFIX dbo: \" + \"PREFIX dcterms: \" + \"SELECT * WHERE {\" + \"?s geo:lat ?lat .\" + \"?s geo:long ?long .\" + \"?s dcterms:subject ?sub}\"); query.setIri(\"?s\", \" = QueryExecutionFactory.sparqlService(service, query.toString());ResultSet results = qe.execSelect();ResultSetFormatter.out(System.out, results); Cheers, Hi all, I am using the Jena Java framework for querying DBpedia end point using SPARQL, to get the type for all points of interest in German cities. I am facing no issue for places that have English DBpedia entries. But, when it comes to place names with utf encoding (German entry - umlaut, sharp S), this query returns no result. This seems to be an issue with the German DBpedia end point, as mentioned over here ( Cheers," "Virtuoso Update Time-frame" "uHi, I was wondering what sort of time-frame there was for the DBpedia.org/sparql endpoint to receive recent improvements from the Virtuoso team. I'm particularly interested in recent improvements which fixed the \"S0022 Error SQ200: No column s-12-2.t.\" bug for unioned subqueries. Thank you, Stephen Hatton Hi, I was wondering what sort of time-frame there was for the DBpedia.org/sparql endpoint to receive recent improvements from the Virtuoso team. I'm particularly interested in recent improvements which fixed the 'S0022 Error SQ200: No column s-12-2.t.' bug for unioned subqueries. Thank you, Stephen Hatton uStephen Hatton wrote: It will happen in the next day or so. We tend to leave dbpedia.org last due to its popularity etc Kingsley" "Querying multi-valued properties" "uHi, How to get only the first value of a multi-valued property with BDpedia in SPARQL. for example the predicate have sometimes more than 1 value. Thanks Regards Olivier Hi, How to get only the first value of a multi-valued property with BDpedia in SPARQL. 
for example the predicate < Olivier uHi Olivier, On 07/02/2013 02:40 AM, Olivier Austina wrote: You can simply use LIMIT, to get only 1 value, e.g.: SELECT * WHERE { dbpedia:France dbpedia-owl:areaTotal ?area.} LIMIT 1" "Dbpedia Spotlight indexation: train a linker based on similarity-thresholds" "uHello, on the index.sh script to index DBpedia Spotlight there is a instruction I do not know what means it, and didn't find information about it. The instruction is as follows: # train a linker (most simple is based on similarity-thresholds) *mvn scala:run -DmainClass=org.dbpedia.spotlight.evaluation.EvaluateDisambiguationOnly Could you explain me it? and how do you execute it? Thank you very much! Attentively, Jairo Sarabia Appstylus developer Hello, on the index.sh script to index DBpedia Spotlight there is a instruction I do not know what means it, and didn't find information about it. The instruction is as follows: # train a linker (most simple is based on similarity-thresholds) mvn scala:run -DmainClass=org.dbpedia. spotlight.evaluation. EvaluateDisambiguationOnly Could you explain me it? and how do you execute it? Thank you very much! Attentively, Jairo Sarabia Appstylus developer" ".Net code cannot read HTTP responses from DBPedia" "uHi all Not sure if this is the right mailing list for this but thought I should report this here. I've recently noticed that I can no longer retrieve data from DBPedia using code written with .Net as I receive the following error: The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF at System.Net.HttpWebRequest.GetResponse() All I'm trying to do is a HTTP GET on a resource URI such as The only headers I'm sending in my request are an accept header plus any .Net adds automatically, my accept header is as follows: application/rdf+xml,text/xml,text/n3,text/rdf+n3,text/turtle,application/x-t urtle,application/turtle,text/plain,application/json;q=0.9,*/*;q=0.8 A minimal example of code which reproduces this is as follows: using System; using System.IO; using System.Net; class DBPediaBugReproduction { public static void Main(string[] args) { HttpWebRequest request = (HttpWebRequest)WebRequest.Create(\" request.Accept = \"application/rdf+xml\"; request.Method = \"GET\"; try { using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) { Console.WriteLine(\"OK\"); StreamReader reader = new StreamReader(response.GetResponseStream()); Console.WriteLine(reader.ReadToEnd()); } } catch (WebException webEx) { Console.WriteLine(\"ERROR\"); Console.WriteLine(webEx.Message); Console.WriteLine(webEx.StackTrace); } } } Any help with this would be appreciated Rob Vesse PhD Student IAM Group Bay 20, Room 4027, Building 32 Electronics & Computer Science University of Southampton SO17 1BJ uHi Rob, The issue is fixed, please try again. We had a typo in recently added timegate link HTTP header . Best Regards, Mitko On Apr 22, 2010, at 1:05 PM, Rob Vesse wrote:" "Fetching selective data from DBpedia" "uHello everyone, I would like to use the DBpedia datasets for my first semantic-web project. So please help me with the following beginner question: I'm working with Jena framework and I have a long list cities across the world ~150 in my Joseki store. Now I want to fetch all the information related to these cities like, monuments, parks, stadiums, universities, etc that are in each one of these cities from DBpedia and store them into Joseki. 
I do not want to download the huge dataset dumps available on the DBpedia site as I need data only about cities. So I would like to know how to fetch only the required data programmatically. Thanks. I have had similar problems with the size of DBpedia. A simple solution is to find the downloads that you are interested in and filter out the triples using grep. If I am interested in the article category "Butterflies" and I suspect there are useful triples in the article_categories download, I can use the following bash script and get a smaller set of triples that contain the references to the URI of interest. Hope this helps - Pete #!/bin/bash grep " dbpedia_nt/categories/butterflies.nt On Fri, Dec 23, 2011 at 4:21 PM, Ravindra Harige < >wrote:"
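For anyone not working in a Unix shell, the same pre-filtering can be done from Java before loading a dump into Joseki or any other store. This is a sketch under stated assumptions: the file names and the two example city URIs are placeholders, and the input is expected to be an uncompressed N-Triples dump.

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FilterTriples {
    public static void main(String[] args) throws IOException {
        // Keep only the N-Triples lines that mention one of the wanted
        // resource URIs, e.g. the ~150 city URIs from the question above.
        Set<String> wanted = new HashSet<>(Arrays.asList(
                "<http://dbpedia.org/resource/Berlin>",
                "<http://dbpedia.org/resource/Paris>"));
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                     new FileInputStream("mappingbased_properties_en.nt"), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     new FileOutputStream("cities_subset.nt"), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                for (String uri : wanted) {
                    if (line.contains(uri)) {
                        out.println(line);
                        break;
                    }
                }
            }
        }
    }
}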
Auto complete will be in place later this week or early next week. As per our AMIs for DBpedia, MusicBrainz, NeuroCommons, and Bio2RDF, we are going to make an EC2 AMI edition of the LOD cloud hosting edition of Virtuoso once all the data is in. Thus, you will be able to simply switch on your personal or service-specific edition of the LOD cloud etc. Links: 1. VirtInstallationEC2" "Looking for topics in dbpedia" "uIf I have to find some information about a Physics course, like scalar products, vector products, what is momentum, torque etc., how can I get this information from dbpedia? Most of this information is available on wikipedia but I could not find it on dbpedia. uHi kumar, I would try to find the pages that describe these terms and get their DBpedia IRI e.g. for every Wikipedia page there is a DBpedia resource and DBpedia tries to associate this resource with facts, types, categories & provenance Cheers, Dimitris On Fri, Apr 29, 2016 at 5:22 PM, kumar rohit < > wrote:" "How to build a meaningful Taxonomy from Wikipedia Categories?" "uHi, I'm trying to leverage the Wikipedia Category Network for a semantic processing application. A set of Wikipedia articles is extracted from the document and I want to build a meaningful hierarchical taxonomy using Wikipedia categories. In my experiments, I found that the original category network of Wikipedia is really messy. For example, when some articles are mentioned in a document, it leads to the whole category network! I haven't used DBpedia before; I'm just really interested to know: if I leverage DBpedia, is it possible to have a meaningful taxonomy of categories with hyponym relations? uThe strength of the Wikipedia categories is that there are a lot of them and a lot of statements matching instances to categories. The weakness of categories is that they are completely disorganized. There are two good strategies for using the categories. One of them is to treat them abstractly and use them as inputs for numerical algorithms. For instance, you can use algorithms such as Kleinberg's Hubs and Authorities where categories are treated as hubs and instances are treated as authorities. Similarly you can create similarity scores based on the categories shared between items. I've used wikipedia categories to create my own well-defined categories such as \"things related to New York City\" or \"obscene things\" or \"things related to skiing\" In all of these categories you have things that are easy to ontologize, such as ski areas, and other things such as that are not easy to ontologize. Generally I've made these by doing waves of expansion and contraction, traversing the graph and adding inclusion and exclusion rules. In the past with half-baked tools I've been able to create good categories of 10,000 or so members in a day or so. With good tools it ought to be possible to work faster. On Thu, Dec 19, 2013 at 4:45 AM, Amir H. Jadidinejad < > wrote: uHi Amir We have done some work related to Wikipedia category processing as the GSOC-2013 project. We used Wikipedia leaf categories as the starting point. A leaf category is a Wikipedia category page that has no links to any other category pages. Next we have defined the concept called “Prominent Node”.
We use the following 3 factors to define a prominent node: 1) The initial candidates for the prominent nodes were the parents of leaf categories. We have used Wikipedia database dumps as our main data source, specifically the tables “category”, “categorylinks”, “page” and “Interlanguage”. 2) Then we find the ones where the *head* of the category name is a plural word (e.g. Naturalized citizens of the United States: pre-modifier {Naturalized}, *head* {citizens} and post-modifier {of the United States}). 3) Then we get the number of interlanguage links for each prominent candidate category and require that a prominent node should have at least 3 interlanguage links. Then we did some clustering based on the identified prominent category names and identified the concept that each prominent node belongs to. So we have produced the following type of Wikipedia hierarchy: Concepts > Prominent nodes > Leaf nodes. Please look at the following links [1], [2] for more details. If you are looking for this kind of work I'm happy to share my experience with you. [1] [2] Thanks On Thu, Dec 19, 2013 at 8:22 PM, Paul Houle < > wrote: uDear Kasun, It's a very useful project, congratulations. I just want to know if it's possible to leverage your method for a local set of documents (not all leaf categories)? Suppose that I have a set of text documents and I want to find the relatedness/similarity between them using abstraction levels in the category network. In this situation, I think, you need further criteria to rank parent categories according to the initial leaf categories, or to modify the concept of prominent nodes to encompass more leaf categories. Please take a look at the following paper: Also, you can see some examples of local taxonomies: It's completely related to my request. Please let me know if it's possible to leverage your method for a local set of documents (not all leaf categories)? Kind regards, Amir
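As a side sketch of how this category hierarchy can be explored on a DBpedia endpoint: the query below is my own illustration rather than the GSoC code. It assumes the article-categories and SKOS categories datasets are loaded, uses dct:subject and skos:broader as in recent DBpedia releases, and picks Monarch_butterfly as an arbitrary example article; it lists that article's leaf categories (categories with no subcategories) together with their parent categories, which roughly correspond to the prominent-node candidates discussed above.

  PREFIX dct:  <http://purl.org/dc/terms/>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT DISTINCT ?leaf ?parent
  WHERE {
    # categories assigned to the example article
    <http://dbpedia.org/resource/Monarch_butterfly> dct:subject ?leaf .
    # keep only leaf categories: no other category points to them via skos:broader
    FILTER NOT EXISTS { ?sub skos:broader ?leaf }
    # their parents are candidate prominent nodes
    ?leaf skos:broader ?parent .
  }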
uHi Amir A few questions to get more sense of your problem. On Sun, Dec 22, 2013 at 9:12 PM, Amir H. Jadidinejad < >wrote: It's possible to apply this method to a selective list of leaf Wikipedia categories. How do you plan to create a match between your \"local set of documents\" and the Wikipedia leaf categories? How are you going to decide which document is related to which Wikipedia leaf category? (Quoting: \"Suppose that I have a set of text documents and I want to find the\") Yes, I agree: using more criteria/items to find the prominent nodes could give a more accurate taxonomy. Also, we have filtered out the following Freebase named entity types from the concept list: /people/person /location/location /organization/organization /music/recordings Thanks" "Exception when querying DBpedia via JENA+SPARQL" "uDear DBpedians, this is a rather odd problem (see exception below) when querying DBpedia via JENA+SPARQL (sample code below). The query itself seems sane and can also be executed without any problems using the web interface: select distinct ?p where { ?p ?o} ORDER BY ?p The exception seems to occur when the iterator steps from to both of which do not contain any special characters or the like. Any ideas, anyone? Best, Heiko The exception I get is (message translated: \"XML document structures have to begin and end with the same element\"): javax.xml.stream.XMLStreamException: ParseError at [row,col]:[232,57] Message: XML-Dokumentstrukturen müssen innerhalb derselben Entität beginnen und enden. at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:592) at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.getElementText(XMLStreamReaderImpl.java:836) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.getOneSolution(XMLInputStAX.java:506) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.hasNext(XMLInputStAX.java:231) My code: QueryExecution qe = QueryExecutionFactory.sparqlService(\" \"select distinct ?p where { ?p ?o} ORDER BY ?p\"); ResultSet rs = qe.execSelect(); qe.close(); try { for (; rs.hasNext();) System.out.println(rs.next().get(\"p\")); } catch (Exception e) { System.out.println(e); } uHi Heiko, I'm not sure, but could you try closing the QueryExecutionFactory after the try/catch block and see if you still have the same error? Regards, Alexandru On Thu, Nov 22, 2012 at 1:06 PM, Heiko Paulheim < uYes, the exception is still there. In fact, my test program only consists of those lines, so there is only one query issued. Best, Heiko Am 22.11.2012 14:37, schrieb Alexandru Todor: uHi Heiko, If you move the qe.close() statement after the try/catch block it will solve your exception. This type of error appears when you close the QueryExecution before you access the ResultSet. I've tested it and it works fine: QueryExecution qe = QueryExecutionFactory.sparqlService(\" distinct ?p where { ?p ?o} ORDER BY ?p\"); ResultSet rs = qe.execSelect(); try { for (; rs.hasNext();) System.out.println(rs.next().get(\"p\")); } catch (Exception e) { System.out.println(e); } qe.close(); uGreat, thank you, Alexandru! Best, Heiko Am 22.11.2012 15:28, schrieb Alexandru Todor:" "Why many mails are coming from this address!!!"
"uWindows Posta cihazından gönderildi" "Getting started with importing dbpedia into Sesame" "uHowdy everybody, I'm doing some experiments with dbpedia and I'm trying to get the data sets imported into sesame and running into to troubles. Does anybody have an existing documentation on importing dbpedia .nt files, or perhaps scripts? I've seen some existing stuff running around on the mailing list for importing dbpedia into Virtuoso but not for Sesame. Cheers, -R. Tyler Ballance uR. Tyler Ballance wrote: uOn Tue, 25 May 2010, Kingsley Idehen wrote: uR. Tyler Ballance wrote: uOn Wed, 26 May 2010, Kingsley Idehen wrote: The problems I'm having is besides sparse mailing list topics, there's not a lot in the way of documentation on \"getting started with dbpedia\" at least for running a local mirror. So I honestly don't know what works best for me, I don't know what works at all! :( Some guidance on how to get going would be useful :) Cheers, -R. Tyler Ballance uR. Tyler Ballance wrote: uOn Tue, 25 May 2010, R. Tyler Ballance wrote: For what it's worth, I finally got things working properly with Sesame backed by PostgreSQL. I had to modify the console script to include the jars in /usr/share/java properly in the classpath of the invocation the console: #!/bin/sh JAVA_OPT=-mx512m lib=\"$(dirname \"${0}\")//lib\" cplib=\"/usr/share/java/\" java $JAVA_OPT -cp \"$lib/$(ls \"$lib\"|xargs |sed \"s; ;:$lib/;g\"):$cplib/$(ls \"$cplib\" | xargs | sed \"s; ;:$cplib/;g\")\" org.openrdf.console.Console $* With this setup properly, I was able to create a PostgreSQL-backed repository and start loading the .nt files in via: console> open dbpedia. dbpedia> load file:///path/to/wikipedia_links_en.nt. Cheers, -R. Tyler Ballance" "How to coerce revenue in sparql" "uI am trying to write a sparql query against a dbpedia sparql endpoint that says \"find all companies that have more than 1B revenue\". The problem I see is that the value of revenue for many companies, e.g. Kaiser Permanente, looks like this: \"1.3E9\"^^dbpedia:datatype/usDollar or for Southwest Airlines like this: \"US$11.0 Billion\"@en How do I coerce these to a number so that they show up in my results? thanks, Scott uWhich property are you using? With you should only get values of a type of currency. Cheers, Max On Fri, May 13, 2011 at 02:05, Scott White < > wrote: uThanks for the feedback. I was using dbpedia:ontology/revenue and dbpedia2:revenue. The dbpedia:ontology/revenue does look like it's only a value of type of currency but there seem to be a fair number of companies that do not have that property but dbpedia2:revenue instead. On Wed, Jun 8, 2011 at 6:58 AM, Max Jakob < > wrote: uIf by dbpedia2 you mean information coming straight from the infobox, with the same information entered in the Wikipedia page. For the companies that use an Infobox mapped in the wiki at mappings.dbpedia.org you should also see a Do you have examples of companies you do not see mapped? Cheers Pablo On Jun 8, 2011 6:56 PM, \"Scott White\" < > wrote: uAh I see. Yes here is an example (Southwest Airlines): On Wed, Jun 8, 2011 at 11:34 PM, Pablo Mendes < > wrote: uits Wikipedia page. For this template, in contrast to Infobox_company, the mapping to revenue is not defined yet. See If you add the following line to the mapping, the data will be included in the future. 
{{ PropertyMapping | templateProperty = revenue | ontologyProperty = revenue | unit = Currency }} Incomplete mappings or unmapped infoboxes are still the main source of incompleteness in the high-quality /ontology/ namespace. We hope this will improve over time as more people contribute to the mappings wiki. Cheers, Max On Thu, Jun 9, 2011 at 20:16, Scott White < > wrote: uThat is extremely helpful. Thanks for clarifying that. On Fri, Jun 10, 2011 at 1:11 AM, Pablo Mendes < > wrote:" "Oracle is a RDBMS and a wise person" "uHey I'm hacking around with DBpedia and I'm encountering funny things, such as: one of the rdf:type values is opencyc:en/RelationalDatabaseServerProgram but the description is: An oracle is a person or agency considered to be a source of wise counsel or prophetic opinion. and the owl:sameAs links are to opencyc:en/OracleDatabaseServer_TheProgram and freebase:Oracle (which is the URI that identifies a type of person) the skos:subject values are dbpedia:Category:Prophecy and dbpedia:Category:Divination I'm sure there are a lot more things out there like this. Just wanted to let you all know. I'm trying to work on a disambiguation service, but this is going to be harder than I thought. Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org" "How to find the type of a concept?" "uHi, I'm a beginner with SPARQL and DBpedia. I am looking for a query to get the type of a specific DBpedia URI. Specifically, I'm looking for 3 types: -PERSON -LOCATION -ORGANIZATION I have two questions: 1. Would you please help me to construct a SPARQL query which gets a bunch of URIs and returns the proper type above for each corresponding concept? 2. How can I get more description of the different types of concepts in DBpedia? Any suggestions and comments are welcome.
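One possible shape for the query being asked about, as a hedged sketch: it assumes the three target types map to dbo:Person, dbo:Place and dbo:Organisation (DBpedia's spelling) and uses an arbitrary VALUES list of example URIs; swap in your own batch of resources.

  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT ?uri ?type
  WHERE {
    # the batch of URIs to classify (illustrative examples only)
    VALUES ?uri { <http://dbpedia.org/resource/Barack_Obama>
                  <http://dbpedia.org/resource/Paris>
                  <http://dbpedia.org/resource/IBM> }
    # the three target classes
    VALUES ?type { dbo:Person dbo:Place dbo:Organisation }
    ?uri a ?type .
  }

Using VALUES lets a whole batch of URIs be checked against the three classes in a single request.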
The DBpedia project currently extracts three different datasets from the Wikipedia infoboxes. 1. The *Infobox Dataset* is created using our initial, now three year old infobox parsing approach. This extractor extracts all properties from all infoboxes and templates within all Wikipedia articles. Extracted information is represented using properties in the properties directly reflect the name of the Wikipedia infobox property. Property names are not cleaned or merged. Property types are not part of a subsumption hierarchy and there is no consistent ontology for the infobox dataset. Currently, there are approximately 8000 different property types. The infobox extractor performs only a minimal amount of property value clean-up, e.g., by converting a value like “June 2009” to the XML Schema format “2009–06”. You should therefore use the infobox dataset only if your application requires complete coverage of all Wikipeda properties and you are prepared to accept relatively noisy data. 2. The *Infobox Ontology*. With the DBpedia 3.2 release, we introduced a new infobox extraction method which is based on hand-generated mappings of Wikipedia infoboxes/templates to a newly created DBpedia ontology . The mappings adjust weaknesses in the Wikipedia infobox system, like using different infoboxes for the same type of thing (class) or using different property names for the same property. Therefore, the instance data within the infobox ontology is much cleaner and better structured than the Infobox Dataset, but currently doesn't cover all infobox types and infobox properties within Wikipedia. Starting with DBpedia release 3.5, we provide three different Infobox Ontology data sets: - The *Ontology Infobox Types* dataset contains the rdf:types of the instances which have been extracted from the infoboxes. - The *Ontology Infobox Properties* dataset contains the actual data values that have been extracted from infoboxes. The data values are represented using ontology properties (e.g., 'volume') that may be applied to different things (e.g., the volume of a lake and the volume of a planet). This restricts the number of different properties to a minimum, but has the drawback that it is not possible to automatically infer the class of an entity based on a property. For instance, an application that discovers an entity described using the volume property cannot infer that that the entity is a lake and then for example use a map to visualize the entity. Properties are represented using properties following the schema. All values are normalized to their respective SI unit. - The *Ontology Infobox Properties (Specific)* dataset contains properties which have been specialized for a specific class using a specific unit. e.g. the property height is specialized on the class Person using the unit centimetres instead of metres. Specialized properties follow the schema (e.g. The properties have a single class as rdfs:domain and rdfs:range and can therefore be used for classification reasoning. This makes it easier to express queries against the data, e.g., finding all lakes whose volume is in a certain range. Typically, the range of the properties are not using SI units, but a unit which is more appropriate in the specific domain. All three data sets are available for download as well as being available for queries via the DBpedia SPARQL endpoint. See also On 12 July 2013 12:06, Basil Ell < > wrote: uDear Jona, thanks a lot for the clarification. I was not aware of the distinction between these two sets of properties. 
Best regards, Basil Am 12.07.2013 16:41, schrieb Jona Christopher Sahnwaldt: uHowever, one issue remains. I have seen that these sets can be separated. For example on the page I can read that this property belongs to the set of properties from the improved extraction approach (via rdfs:isDefinedBy But when I query the SPARQL endpoint using SELECT * WHERE { ?p ?o . } then I get the following result: p o From this information I cannot derive*) to which set the property belongs to. Are the datasets the HTML page is generated from and the data that can be queried via the endpoint different? *) I know that I could use a regular expression to filter out properties that do not contain \"dbpedia.org/ontology\", but I'd prefer using the rdfs:isDefinedBy property. Thanks again in advance, Basil Am 12.07.2013 17:32, schrieb Basil Ell:" "Mapping the DBPedia's citations to existing bibliographical data" "uHello, Concerning the DBPedia citations & references challenge, we report about a project that aims to map the DBPedia's citations to existing bibliographical data. Even though the deadline for the challenge has passed we would be grateful for your feedback about the project. More specifically, a number of properties of the enwiki-20160305-citation-data.ttl file have been used in order to facilitate the linking of the triples' subjects (found in the file) to URIs from other bibliographical sources. As a result, a total of 402,354 links were discovered, with 379,835 corresponding to distinct subjects. Emphasis has been given to the properties that represent identifiers, that can be found in other data sources and are relatively common. In particular, the properties isbn, isbn13, issn, doi, journal, series, periodical, magazine, oclc, pmid and arxiv have been used combined with the title and year. The linking of the data has been based on a number of LOD dumps that are available for download and bibliographical websites that provide their metadata through APIs. The project comprises of an application written in Java that processes and links the data and a triplestore which stores the original and the processed data. The following data sources have been used in the project: Data source Type Unique triples in local data dump DBPedia citations Data dump 76.2M DBLP - Digital Bibliography & Library Project Data dump 88.1M BNB - British National Bibliography Data dump 111M DNB - Deutsche Nationalbibliografie Data dump 414.2M BNE - Biblioteca Nacional de España Data dump 68.7M Springer Data dump 3.3M WorldCat API 2.1M PubMed API 0.629M arXiv API 0.021M The enwiki-20160305-citation-data.ttl file contains 76,223,926 unique triples with 12,391,363 distinct subjects. The results found in the project correspond to 379,835 / 999,679 = 38% of the distinct subjects extracted and to 379,835 / 12,391,363 = 3% of the entire file. The links found, are contained in the dbpedia_combined_links.nt.zip file and also can be queried from the following GraphDB Free SPARQL endpoint: A more detailed report about the project can be found at: Respectfully, David Nazarian uGreat work David! Thank you for the links and the detailed report Are you planning to open source the code that generates the links? We could try and integrate it into the DBpedia release publishing workflow. Cheers, Dimitris On Tue, Oct 18, 2016 at 2:28 PM, Δαβίδ Ναζαριάν < > wrote: uAwesome & looking forward to it! 
(cc'ing the wikicite community) pls note that it is DBpedia - lowercase p :) Cheers, Dimitris On Wed, Oct 19, 2016 at 1:08 PM, David < > wrote: uHello Dimitris, finally the project is complete :) It can be downloaded from the following link: Best Regards, David Nazarian On Wednesday, October 19, 2016 1:26 PM, Dimitris Kontokostas < > wrote: Awesome & looking forward to it!(cc'ing the wikicite community) pls note that it is DBpedia - lowercase p :) Cheers,Dimitris On Wed, Oct 19, 2016 at 1:08 PM, David < > wrote: Hello Dimitris, I'm very happy to hear that you find the project useful. The code is still under development but it's planned to be open sourced upon completion. It would be great if it could contribute to DBPedia. Best Regards, David Nazarian On Tuesday, October 18, 2016 3:15 PM, Dimitris Kontokostas < > wrote: Great work David! Thank you for the links and the detailed report Are you planning to open source the code that generates the links?We could try and integrate it into the DBpedia release publishing workflow. Cheers,Dimitris On Tue, Oct 18, 2016 at 2:28 PM, Δαβίδ Ναζαριάν < > wrote: Hello, Concerning the DBPedia citations & references challenge, we report about a project that aims to map the DBPedia's citations to existing bibliographical data. Even though the deadline for the challenge has passed we would be grateful for your feedback about the project. More specifically, a number of properties of the enwiki-20160305-citation-data. ttl file have been used in order to facilitate the linking of the triples' subjects (found in the file) to URIs from other bibliographical sources. As a result, a total of 402,354 links were discovered, with 379,835 corresponding to distinct subjects. Emphasis has been given to the properties that represent identifiers, that can be found in other data sources and are relatively common. In particular, the properties isbn, isbn13, issn, doi, journal, series, periodical, magazine, oclc, pmid and arxiv have been used combined with the title and year. The linking of the data has been based on a number of LOD dumps that are available for download and bibliographical websites that provide their metadata through APIs. The project comprises of an application written in Java that processes and links the data and a triplestore which stores the original and the processed data.  The following data sources have been used in the project: | Data source | Type | Unique triples in local data dump | | DBPedia citations | Data dump | 76.2M | | DBLP - Digital Bibliography & Library Project | Data dump | 88.1M | | BNB - British National Bibliography | Data dump | 111M | | DNB - Deutsche Nationalbibliografie | Data dump | 414.2M | | BNE - Biblioteca Nacional de España | Data dump | 68.7M | | Springer | Data dump | 3.3M | | WorldCat | API | 2.1M | | PubMed | API | 0.629M | | arXiv | API | 0.021M |  The enwiki-20160305-citation-data. ttl file contains 76,223,926 unique triples with 12,391,363 distinct subjects. The results found in the project correspond to 379,835 / 999,679 = 38% of the distinct subjects extracted and to 379,835 / 12,391,363 = 3% of the entire file.  The links found, are contained in the dbpedia_combined_links.nt.zip file and also can be queried from the following GraphDB Free SPARQL endpoint: Nazarian uGreat! Thank you for the updates David and open sourcing your project, We will try to give you some feedback soon On Sun, Feb 26, 2017 at 7:34 PM, David < > wrote:" "bif:contains and Film" "uHi all ! 
I have tried to get the resource when asking for \"film\" in the indexes in several ways: SELECT DISTINCT ?s ?o FROM WHERE {{?s rdfs:label ?o.[] a ?s .FILTER( bif:contains(?o, \"film\" ) )}}LIMIT 36 SELECT DISTINCT ?s ?o ?type FROM WHERE {{?s rdfs:label ?o. ?s rdf:type ?type .FILTER( bif:contains(?o, \"film\" ) )}}LIMIT 36 And even: SELECT ?s ?page ?label ?textScore AS ?Text_Score_Rank ( (?s) ) AS ?Entity_Rank WHERE { ?s foaf:page ?page ; rdfs:label ?label . FILTER( lang( ?label ) = \"en\" ) . ?label bif:contains 'film' OPTION (score ?textScore ) . } ORDER BY ASC ( (?s) ) But with no success so far. I also tried: SELECT DISTINCT ?s ?o ?type FROM WHERE {{?s rdfs:label ?o. ?s rdf:type ?type .FILTER( str(?o) = \"Film\" ) }} But it gives an out of time error Any idea how can I get keyword \"film\"? Thanks a lot Vanessa uOn 12/7/10 11:59 AM, Vanessa Lopez wrote: Vanessa, These pages should help you understand why you are having troubles: 1. ?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FFilm uHi Kingsley, the thing is that I actually have no problems with my local version of Virtuoso, the query SELECT DISTINCT ?s ?o FROM WHERE {{?s rdfs:label ?o.[] a ?s .FILTER( bif:contains(?o, \"film\" ) )}}LIMIT 36 returns to me Film ( position E.g.: http://dbpedia.org/class/yago/1979Films film Using the latest version of DBpedia Thus, I can not figure it out why is different with the SPARQL end point Thanks anyway! Vanessa On 7 Dec 2010, at 17:53, Kingsley Idehen wrote: uOn 12/7/10 1:09 PM, Vanessa Lopez wrote: See: http://lod.openlinksw.com/c/CQF3N5M (LOD Cloud Cache instance). DBpedia instance is on a smaller setup (4 node cluster on a single box) vs LOD Cloud Cache (which is 8 node cluster across 8 physical machines). Thus, it has tighter constraints re. query timeouts etc Anyway, we double check if there's something else amiss. uHi Vanessa, try (there's a webservice available too ) Or use the Uberblic Search API, which is more powerful and up-to-date: Make sure to include the \"&fields;=source_uri\" parameter (because \"source_uri\" is an optional response field) data sources. An example of a more expressive query is 0Murray]&fields;=source_uri Cheers, Georgi" "Several loose notes to DBpedia staff" "uHi, I'm almost done on a work that I'm doing around DBpedia, and here is a collection of loose comments / suggestions that I want to ask you guys: - I've installed Virtuoso 6.0.0 on a server and loaded DBpedia, following the DBPedia installation script given published early this year on this mailing list, mostly to avoid hammering the DBpedia service for some upcoming jobs that I'm planning to do. It's working and with surprisingly no problems at the moment, but I miss the snorql service. How can I replicate it? - The prefixes as stated on the SNORQL page are somehow outdateddbpedia-owl is missing. - insisting (again) on a subject that I've addresses a couple of months agowhen are you planning on starting to create Core datasets for other languages? back then, the suggestion was to use LANG.dbpedia.org graphswell, I could really use it, and I'm ready to help you out in Portuguese. - DBpedia should have a sort of \"troubleshoot\" page, as there are many users (myself included) that fall for the same issues - for instance, the encoding threads - and work-around suggestions. - The DBpedia ontology is really a great resource, but I find it a little unbalanced uNuno Cardoso wrote: uOn Fri, May 29, 2009 at 3:26 AM, Nuno Cardoso < > wrote: u uHello, I'm perfectly happy with this bottom-up ontology. 
I understand that it's the natural way to do, looking at DBpedia's extraction approach, and it's proven to be quite useful. As DBpedia will evolve and more infobox classes are mapped, we can expect future finer-grained ontology versions, with more sub-classes, isn't? Nuno Cardoso, PhD Student. Try out my new named entity recognition service: Rembrandt - On Sun, May 31, 2009 at 19:41, Georgi Kobilarov < >wrote: uHello, we are currently working on something to be able to model the ontology in Wikipedia directly. There was a discussion whether it would be better to have an own database with mappings, rules and classes, which would allow a consistent view on the engineering or if we would include it in Wikipedia. Both approaches have advantages and it was not a light decision. Wikipedia is, after all, a wiki so it might never be possible to get consistent information out of it. So, there is a strong incentive to allow ontology engineering at the place where the data comes from We are still working on the live extraction and hopefully will face deployment soon, maybe within this week. Once this change arrives at the Wikipedia community, there will be many hands to clean up the extraction process and generate classes. By the way, did you offer some mails ago to set up a Portuguese DBpedia? Regards, Sebastian Hellmann, AKSW Nuno Cardoso schrieb: uSebastian Hellmann wrote: I think there's also the fact that somebody is always going to have a complaint about somebody else's ontology. In any large project, there are always going to be quirky decisions made." "Oracle is a RDBMS and a wise person" "uHey I'm hacking around with DBpedia and I'm encountering funny things, such as: one of the rdf:type is opencyc:en/RelationalDatabaseServerProgram but the description is: An oracle is a person or agency considered to be a source of wise counsel or prophetic opinion. and the owl:sameAs links are to opencyc:en/OracleDatabaseServer_TheProgram and freebase:Oracle (which is the URI that identifies a type of person) the skos:subject are dbpedia:Category:Prophecy and dbpedia:Category:Divination I'm sure there are a lot more things out there like this. Just wanted to let you all know. I'm trying to work on a disambiguation service, but this is going to be harder than I thought. Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org Hey I'm hacking around with DBpedia and I'm encountering funny things, such as: www.semanticwebaustin.org" "How to find the type of a concept?" "uHi, I'm a beginner with SPARQL and DBpedia. Looking for a query to get the type of a specific DBpedia URI. Exactly, I'm looking for 3 types:     -PERSON     -LOCATION     -ORGANIZATION I have two questions: 1. Would you please help me to construct an SPARQL query which gets a bunch of URIs and returns the above proper type of the corresponding concept? 2.How can I get more description about different types of concepts in DBpedia? Any suggestions and comments are welcome. Hi, I'm a beginner with SPARQL and DBpedia. Looking for a query to get the type of a specific DBpedia URI. Exactly, I'm looking for 3 types: -PERSON -LOCATION -ORGANIZATION I have two questions: 1. Would you please help me to construct an SPARQL query which gets a bunch of URIs and returns the above proper type of the corresponding concept? 2.How can I get more description about different types of concepts in DBpedia? Any suggestions and comments are welcome. 
uHi Amir, On 02/08/2013 11:56 PM, Amir Hossein Jadidinejad wrote: If I understand what you want to achieve correctly then the following query, asking for Paris as an example, should do it: SELECT * WHERE { dbpedia:Paris ?p ?o } If you want to get information about all DBpedia concepts then you can use following query: SELECT ?concept ?p ?info WHERE { {SELECT DISTINCT ?concept WHERE { ?s a ?concept } LIMIT 1000}. ?concept ?p ?info} In case you want to get the description of a specific DBpedia class, e.g. Person, then use the following one: SELECT ?p ?info WHERE {dbpedia-owl:Person ?p ?info} Hope that helps." "looking for co founder" "uHi I am William here, I saw your profile in I am developing a structured knowledge base system, looking for a co-founder/developer The aim of the website is to let user find the information quickly, easily also get more informative and useful information from the structured and linked knowledge (like Semantic Web). The website is at , to browse the knowledge try those link in Structured Knowledge section. when knowledge is structured, we can query specific question on the machine, like who lives in new york, know php language, looking for employment, age between 20-40 who is the president that reign the longest and still in power now That's one of the component, the website also have common functional tools, such as currency converter, unit converter, dictionary, calculator, simple text processing, etc. The special thing of these is, it's using the natural language understanding for user to interact with. I am looking for partner to enhance/develop further the website. Also enhance the quality of the strcutured knowledge by linking more information sources, other than DBPedia & CIA Factbook that it currently have. I have some money generating ideas, we can share this depends on the percentage of the contribution. I believe it has great potential in revenue generation, such as integrating affiliates link that's related to the knowledge. The ads will become a help rather the hinder to the user because it provide relevant information to what user searching for. Write to me if you are interested in growing QwerQ together with me :) Regards, William Hi I am William here, I saw your profile in William uVery very very sorry, please just ignore that message. It's not for dbpedia-discussion. On Thu, Sep 27, 2012 at 8:23 PM, william kisman < > wrote: Very very very sorry,  please just ignore that message. It's not for dbpedia-discussion. On Thu, Sep 27, 2012 at 8:23 PM, william kisman < > wrote: Hi I am William here, I saw your profile in William" "DBpedia dataset published using Hypernotation" "uHello, First, thank you for the great effort to make DBpedia available and improve it continuously. I would like to inform you that I used DBpedia as an example showing the benefits of Hypernotation [1]. Hypernotation is a new idea of publishing structured data on the Web I invite you to visit the DBpedia dataset [2] on the homepage, as well as browse the DBpedia data [3]. Regards, Vuk Milicic [1] [2] [3] u0€ *†H†÷  €0€1 0 +" "Getting Freebase onto the Semantic Web" "uHi John, John Giannandrea wrote: This would be great. Getting the data out on the Semantic Web as Linked Data also don't have to be a big effort as you are already having everything that is needed in place. I think for the first iteration it is completely OK if you define a new set of URIs for your schema. 
As a second iteration you could replace terms from your schema with terms from well-known vocabularies like FOAF or SKOS. 1. there would be a URI for each topic in Freebase and dereferencing this URI over the Web would return a RDF description of the concept using a Freebase specific schema. 2. this URI would be interlinked with other data sourcesin the LOD cloud, so that people could use Ssemantic Web browsers to navigate from these data sources into the Freebase data and so that Semantic Web crawlers can find and index the data. So, a minimal effort approach to getting Freebase onto the Semantic Web could look like this: 1. Define URIs for all your concepts, somethink like 2. Deploy a Linked Data wrapper around your API that returns an RDF description of (in the example above) the film when somebody dereferences the URI above. A very easy way to implement such a wrapper would be to just tweek the PHP script that we are using for the RDF Book mashup. The script is found at 3. Interlink this RDF Version of Freebase with other data sources. The simplest option here would be to interlink Freebase with DBpedia as both dataset contain Wikipedia article IDs. So what you would do is to add a RDF link stating that a specific concept in Freebase is the same as a concept in DBpedia to the RDF you return when one of your URIs gets dereferenced. For instance: owl:sameAs 4. You would send us an RDF file containing these RDF links for all Freebase concepts and we would load it into DBpedia and also serve these links. I think all this could be done within 3 days work and would allow Linked Data browsers, like the ones listed here to access and navigate between both datasets and would allow crawlers, like the ones listed here to index both datasets. What do you think? Technical background information about the whole process is found in After his, one could start thinking about also providing RDF dumps, so that people could load Freebase and DBpedia together into a RDF store and do whatever they want with the data. Or think about using well known terms from other vocabularies and ontologies. Using terms from well-known vocabularies as well as serving the data using different vocabularies is both important, but in my opinion something for the second step. First step: Publish linked data. See what people do with it. Cheers Chris" "Install error for DBpedia extraction framework" "uHi I am trying to install the DEF, when running the mvn install in the extraction directory. It does not build successfully, can point some light ? Thanks. [INFO] Scanning for projects[INFO] uHi William, We are under active development/structural changes and for dump-based extraction the dump branch is the stable one. This should change soon, but till then, try switching to the dump branch (hg update dump). Best, Dimitris On Fri, Sep 14, 2012 at 9:48 PM, William Kisman < > wrote: uHi, Okay, noted thanks :) On Sat, Sep 15, 2012 at 2:55 AM, Dimitris Kontokostas < >wrote: uHi I have made hg update dump Success on mvn install but fail on mvn scala:run in dump directory It say /home/release/wikipedia/wikipedias.csv (No such file or directory) Can give me more light ? 
Thanks dell:/home/william/Downloads/02a1dff9e208/dump # mvn scala:run [INFO] Scanning for projects[INFO] [INFO] uInside the dump folder try $/run download YourDownloadConfigFile (to download the wikipedia dumps) $/run extract YourExtractConfigFile (to start extraction) You can also use the provided files for default options Best, Dimitris On Fri, Sep 14, 2012 at 11:38 PM, William Kisman < > wrote: uI had the same error, if you are using 3.8 here's a short step by step guide : After compiling the sources : 1° Copy core-3.8.jar & dump-3.8.jar (located in /home/user/.m2/repository/org/dbpedia/extraction/dump/3.8/dump-3.8.jar and /home/dbpedia2/.m2/repository/org/dbpedia/extraction/core/3.8/core-3.8.jar) to the same folder (eg FOLDER) 2° Edit the file extraction.default.properties (located in extraction_framework/dump) to suit your needs and copy it to FOLDER 3° place yourself in FOLDER and try the following : scala -cp core-3.8.jar:dump-3.8.jar org.dbpedia.extraction.dump.extract.Extraction extraction.default.properties There was still an error after that, you can find it and its solution there : Regards, Olivier Sollier From: Dimitris Kontokostas [mailto: ] To: William Kisman [mailto: ] Cc: Sent: Fri, 14 Sep 2012 23:16:04 -0400 Subject: Re: [Dbpedia-discussion] Install error for DBpedia extraction framework Inside the dump folder try $/run download YourDownloadConfigFile (to download the wikipedia dumps) $/run extract YourExtractConfigFile (to start extraction) You can also use the provided files for default options Best, Dimitris On Fri, Sep 14, 2012 at 11:38 PM, William Kisman < > wrote: Hi I have made hg update dump Success on mvn install but fail on mvn scala:run in dump directory It say /home/release/wikipedia/wikipedias.csv (No such file or directory) Can give me more light ? Thanks dell:/home/william/Downloads/02a1dff9e208/dump # mvn scala:run [INFO] Scanning for projects[INFO] [INFO] uHi Olivier, Thanks for your reply. I have done those but still there's error. Missing the file wikipedias.csv dell:/home/william/universe/Downloads/02a1dff9e208/dump # scala -cp core-3.8.jar:dump-3.8.jar org.dbpedia.extraction.dump.extract.Extraction extraction.default.properties parsing /home/william/universe/dev/dbp-latest/wikipedias.csv java.io.FileNotFoundException: /home/william/universe/dev/dbp-latest/wikipedias.csv (No such file or directory) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream. (FileInputStream.java:137) Hi Dimitris, I have downloaded the dump file successfully. But the extraction does not generate anything although it say BUILD SUCCESS. I have configured the base-dir in extraction.default.properties base-dir=/home/william/universe/dev/dbp-latest/dump/ This directory already contain the folder enwiki and downloaded files. What could be wrong from this ? dell:/home/william/universe/Downloads/02a1dff9e208/dump # /run extract config=extraction.default.properties [INFO] Scanning for projects [INFO] [INFO] uHi William, Did you download the files yourself, or did you use the DBpedia downloader? In the command lines you posted, I don't see this one, suggested by Dimitris as a fix: $/run download YourDownloadConfigFile (to download the wikipedia dumps) Abs, Pablo On Sun, Sep 23, 2012 at 8:59 AM, William Kisman < > wrote: uHi Pablo Yes, I did download using the DBPedia Framework, with the command given by Dimitris. 
dell:/home/william/universe/Downloads/02a1dff9e208/dump # /run download config=download.minimal.properties below is the config for download.minimal.properties # NOTE: format is not java.util.Properties, but org.dbpedia.extraction.dump.download.DownloadConfig # Default download server. It lists mirrors which may be faster. base-url= # Replace by your target folder. base-dir=/home/william/universe/dev/dbp-latest/dump/ # Replace xx by your language. download=en:pages-articles.xml.bz2 # Only needed for the ImageExtractor # download=commons:pages-articles.xml.bz2 # Unzip files while downloading? Not necessary, extraction will unzip on the fly. Let's save space. unzip=false # Sometimes connecting to the server fails, so we try five times with pauses of 10 seconds. retry-max=5 retry-millis=10000 Then I have these files downloaded dell:~ # ll /home/william/universe/dev/dbp-latest/dump/ total 4 drwxr-xr-x 3 root root 4096 Sep 23 11:19 enwiki dell:~ # ll /home/william/universe/dev/dbp-latest/dump/enwiki/ total 8 drwxr-xr-x 2 root root 4096 Sep 23 11:19 20120902 -rw-r uHi William, the problem is this line in extraction.default.properties: languages=10000- This means that all languages with 10000 articles or more should be extracted (and to find out which languages that are we would need the language list in wikipedias.csv). But you need only English, so you should change it to languages=en This should fix that problem. We should probably also have a file extraction.minimal.properties that works with download.minimal.properties, and we should have a better error message. Noted. Cheers, JC On Sun, Sep 23, 2012 at 11:07 AM, William Kisman < > wrote: uOn Mon, Sep 24, 2012 at 5:20 AM, William Kisman < > wrote: That's not really a problem, the file will be generated on the first run. Sorry about the confusing error message. You don't have commons-compress on your classpath. Try Maven instead. This should work, there were typos in your previous commands: cd dump /run extraction extraction.default.properties Also, in extraction.default.properties, you should delete AbstractExtractor and ImageExtractor. AbstractExtractor doesn't work without a local installation of Wikipedia, and ImageExtractor doesn't work without the Wikipedia commons dump file. Cheers, JC" "The problem about API of virtuoso JDBC3.0" "uHi Jiusheng, Their appears to be a typo in the online documentation which references the \"getRdfObject\" method when you should just use the normal Java \"getObject\" method to retrieve the data and then use the \"instanceof\" method to determine its type. I shall get the typo corrected, apologies for the confusion Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 4 Sep 2008, at 10:33, jiusheng chen wrote: uHi Hugh, Please point us the link once you update it. Thank you. On Thu, Sep 4, 2008 at 8:25 PM, Hugh Williams < >wrote:" "fields on dbpedia but not on wikipedia infobox template?" "uOn of 1849, but I don't see \"1849\" delimited as a single value anywhere on dbpedia values are based on wikipedia infobox values, but it looks like there is more to it than that. How did it end up as a field on the dbpedia page? (And in what is probably a related question, why isn't this field on other pages about topics of the same class" "Bls: Abstract extraction problem" "uHi Dimitris and Jona, Thanks for your reply. I found the problem. I forgot to configure the proxy in extraction-framework/pom.xml.   
Regards, Riko Dari: Dimitris Kontokostas < > Kepada: riko adi prasetya < > Cc: \" \" < > Dikirim: Senin, 4 Maret 2013 22:06 Judul: Re: [Dbpedia-discussion] Abstract extraction problem Hi Riko, I updated the settings in the repository (although I don't think this is it) but can you pull and retry? If the problem persists, can you try to debug it and see where exactly in the retrievePage() function is the problem? e.g. test the generated url and see what you get Best, Dimitris On Mon, Mar 4, 2013 at 2:54 PM, riko adi prasetya < > wrote: WARNING: error processing page 'title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in' java.lang.Exception: Could not retrieve abstract for page: title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:66) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:21) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59) at scala.collection.immutable.List.foreach(List.scala:76) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:239) at scala.collection.immutable.List.flatMap(List.scala:76) at org.dbpedia.extraction.mappings.CompositeMapping.extract(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.RootExtractor.apply(RootExtractor.scala:23) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:29) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:25) at org.dbpedia.extraction.util.SimpleWorkers$$anonfun$apply$1$$anon$2.process(Workers.scala:23) at org.dbpedia.extraction.util.Workers$$anonfun$1$$anon$1.run(Workers.scala:131) uHi Dimitris, I use my campus' internet connection that must use proxy. So, i must configure it in extraction-framework/dump/pom.xml.  I configure it like this, extraction org.dbpedia.extraction.dump.extract.Extraction -server -Xmx1024m -Dhttp.proxyHost=152.118.24.10 -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts=\"localhost|152.118.*.*|*.ui.ac.id\" Before I solved this problem, I found some kind of message error : - java.net.ConnectException: Connection timed out - java.net.SocketException: Invalid argument or cannot assign requested address - java.net.UnknownHostException: www.w3.org - java.lang.Exception: Could not retrieve abstract for page: title=Daftar filsuf;ns=0/Main/;language:wiki=id,locale=in I have sent pull request. Thank you Dimitris and Jona   Regards, Riko Dari: Dimitris Kontokostas < > Kepada: Riko Adi Prasetya < > Cc: Jona Sahnwaldt < >; \" \" < >; Jose Emilio Labra Gayo < > Dikirim: Selasa, 5 Maret 2013 23:09 Judul: Re: [Dbpedia-discussion] Abstract extraction problem Hi Riko, We had similar (proxy) problems in the past but we didn't documented them anywhere.Would you mind writing how you bypassed the proxy issue? 
You could make a pull request with your proxy-pom configuration (as a comment) and drop a couple of lines explaining it here: And of course, you can also add everything that you had to figure out on your own:) Thanks Dimitris On Tue, Mar 5, 2013 at 5:36 PM, Riko Adi Prasetya < > wrote: Hi Dimitris and Jona, WARNING: error processing page 'title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in' java.lang.Exception: Could not retrieve abstract for page: title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:66) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:21) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59) at scala.collection.immutable.List.foreach(List.scala:76) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:239) at scala.collection.immutable.List.flatMap(List.scala:76) at org.dbpedia.extraction.mappings.CompositeMapping.extract(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.RootExtractor.apply(RootExtractor.scala:23) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:29) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:25) at org.dbpedia.extraction.util.SimpleWorkers$$anonfun$apply$1$$anon$2.process(Workers.scala:23) at org.dbpedia.extraction.util.Workers$$anonfun$1$$anon$1.run(Workers.scala:131)" "The LAST INVITATION for CSCEET2017, Beirut, Lebanon - April 26, 2017" "u[Apologies for cross-posting. Please forward to anybody who might be interested.] 
The Fourth International Conference on Computer Science, Computer Engineering, and Education Technologies [CSCEET 2017] April 26-28, 2017 - Beirut, Lebanon Website: Paper Due : Extended to March 16, 2016 To submit your paper: TOPICS: * Information and Data Management   * Social Networks * Data Compression   * Information Content Security * E-Technology   * Mobile, Ad Hoc and Sensor Network Management * E-Government   * Web Services Architecture, Modeling and Design * E-Learning   * Semantic Web, Ontologies * Wireless Communications   * Web Services Security * Mobile Networking, Mobility and Nomadicity   * Quality of Service, Scalability and Performance * Ubiquitous Computing, Services and Applications   * Self-Organizing Networks and Networked Systems * Data Mining   * Data Management in Mobile Peer-to-Peer Networks * Computational Intelligence   * Data Stream Processing in Mobile/Sensor Networks * Biometrics Technologies   * Indexing and Query Processing for Moving Objects * Forensics, Recognition Technologies and Applications   * Cryptography and Data Protection * Information Ethics   * Peer-to-Peer Social Networks * Fuzzy and Neural Network Systems   * Mobile Social Networks * Signal Processing, Pattern Recognition and Applications   * User Interfaces and Usability Issues form Mobile Applications * Image Processing   * Sensor Networks and Social Sensing * Distributed and parallel applications   * Social Search * Internet Modeling   * Embedded Systems and Software * User Interfaces, Visualization and Modeling   * Real-Time Systems * XML-Based Languages   * Multimedia Computing * Network Security   * Software Engineering * Remote Sensing WE ARE SINCERELY LOOKING FORWARD TO SEE YOU IN BEIRUT IN APRIL 2017 [Apologies for cross-posting. Please forward to anybody who might be interested.] The Fourth International Conference on Computer Science, Computer Engineering, and Education Technologies [CSCEET 2017] April 26-28, 2017 - Beirut, Lebanon Website:" "Categories without labels in dbpedia 3.2?" "uI've been looking at the wikipedia data and noticed the following issue. There seem to be categories in articlecategories_en that don't exist in categories_label_en, for instance If I look in the label file, $ bzcat ~/dbpedia3.2/categories_label_en.nt.bz2 | grep The_Like_Young I only find \"The Like Young songs\"@en . which doesn't match. I found about 31,695 cases like this. I could either ignore these categories or make up labels for them from looking at the URLs, but it may point to a deeper problem. I'm also thinking about enclosure relationships between categories: If I look at wikipedia, I find pages like: Note that Chemistry contains subcategories such as Perhaps I'm missing something, but I don't see subcategory relationships kept track of in wikipedia. I know that wikipedia categories are pretty messy, but I've found that graph traversals & filtering can be applied to them to find members of classes that slip through the cracks of more rigorous taxonomies uPaul, On 10 Apr 2009, at 02:11, Paul Houle wrote: I'm not sure I get your point. The relationship between the categories “Chemistry” and “Acid-base chemistry” is present both in Wikipedia and in DBpedia. In Wikipedia, it can be seen as the first subcategory on the Category:Chemistry page. In DBpedia, it is expressed as a triple using the skos:broader predicate, which can be found in the “Categories (Skos)” dump: uRichard Cyganiak wrote: Thanks for the tip. I see it now. The categories with missing labels are a real problem. 
My system makes me aware of things like that, but joins could silently fail for people who are using other systems." "Last Mile: The 9th International and Interdisciplinary Conference on Modeling and Using Context (CONTEXT 2015)" "uLast Mile The 9th International and Interdisciplinary Conference on Modeling and Using Context (CONTEXT 2015) 2-6 November 2015, Lordos Beach Hotel, Larnaca, Cyprus Deadline: June 15, 2015 The CONTEXT conferences are the world's prime forum for presentation and exchange of insights and cutting-edge results from the wide range of disciplines concerned with context. The main theme of CONTEXT 2015 is \"Back to the roots\", focusing on the importance of interdisciplinary cooperations and studies of the phenomenon. Context, context modeling and context comprehension are central topics in linguistics, philosophy, sociology, artificial intelligence, computer science, art, law, organizational sciences, cognitive science, psychology, etc. and are also essential for the effectiveness of modern, complex and distributed software systems. CONTEXT 2015 invites high-quality contributions from researchers and practitioners in foundational studies, applications and evaluations of modeling and use of context in all relevant fields. Areas of interest include, but are not limited to, the role of context seen from different perspectives in: · Agent-based architectures · Ambient intelligence · Cognition and perception by humans and artifacts · Context-aware and situated systems · Context modeling tools · Communication and dialogue · Data analysis and visualization · Decision making · Discourse comprehension and representation · Engineering, e.g., in transport networks, industrial plants etc. · Experimental philosophy and experimental pragmatics · (Formal) models of context · Human-computer interaction · Knowledge representation · Language acquisition and processing · Learning, knowledge management and sharing · Logic and reasoning · Machine learning · Ontology/ies · Semantics and Pragmatics · Smart and interactive spaces · Understanding art, images, music and theatre Proceedings Accepted papers and poster abstracts will be published in a volume of the Springer LNAI series. Submission format Submissions may be either full papers of up to 14 pages (in Springer LNCS format) or poster abstracts of 4-6 pages. Full papers may be accepted as such with oral presentation, or their authors may be invited to prepare a poster abstract. Detailed formatting and submissions instructions will be provided. Conference events CONTEXT 2015 will include paper presentation sessions, a poster and demonstration session, two days of workshops, and a doctoral consortium as well as keynote talks and a panel discussion. Workshops and the doctoral consortium will circulate separate calls for papers and participation, which will also be available at the conference web site. All accepted authors will have the option of presenting a system demonstration at the poster session. Keynote talks · Emma Borg, University of Reading, UK Linguistic Meaning, Context and Assertion · Enrico Rukzio, Universität Ulm, Germany Mobile Interaction with Pervasive User Interfaces Workshops CONTEXT 2015 workshops will provide a platform for presenting novel and emerging ideas in the use and the modelling of context in a less formal and possibly more focused way than the conference itself. 
Three workshops are confirmed so far: · Smart University 3.0 · CATI - Context Awareness & Tactile design for mobile Interaction · SHAPES 3.0" "Extended deadline for applying PhD positions of WDAqua project" "uI apologise for cross-posting. EIS research group at the university of Bonn in Germany is pleased to announce the extended deadline (till 28th February) for: *Three full time PhD positions in the field of question answering on interlinked datasets* The successful candidates will carry out research towards a PhD on one of the following topics: (1) Effectively and efficiently supporting dataset maintainers in semi-automatically cleaning and enriching datasets to make them fit for question answering. (2) Discovering and retrieving datasets of a high quality for QA with a high precision and recall; automating dataset quality assessment. (3) Efficient and precise translation of natural language questions into structured queries against a federated knowledge base. The successful candidates will work in the Marie Skłodowska-Curie International Training Network WDAqua (Answering Questions using Web Data), undertaking advanced fundamental and applied research into models, methods, and tools for data-driven question answering on the Web, spanning a diverse range of areas and disciplines (data analytics, social computing, Linked Data, and Web science). They will participate in the network’s training schedule and perform internships with other project partners. There will be no teaching obligation but teaching can be arranged, if desired. On top of a generous basic salary we offer a mobility allowance and, if applicable, a family allowance. Initial appointment will be for one year with a possible extension to 3 years. The position starts as soon as possible. Initial appointment will be for one year with a possible extension to 3 years. *We offer:* - Two or more academic supervisors from the WDAqua network (the PhD student working on topic (1) will obtain a dual PhD degree with Université Jean Monnet Saint-Étienne, France) - A support grant to attend conferences, summer schools, and other events related to your research. - The possibility to obtain a discounted public transport ticket. *Requirements* - A Master degree in Computer Science or equivalent. - Not having resided/worked in Germany for more than 12 months in the 3 years before starting the work (Marie Skłodowska-Curie mobility rule) - Proficiency in spoken and written English. Proficiency in German is a plus. - Proficiency in Programming languages like Java/Scala or JavaScript, and modern software engineering methodology. - Familiarity with Semantic Web, Natural Language Processing, Indexing and Data Analytics is an asset. *How to apply* To apply, please send the following documents in pdf to before 28th February 2015. - CV - Master certificate or university transcripts - A motivation letter including a short research plan targeted to one of the topics listed above, - Two letters of recommendation - English writing sample (e.g. prior publication or master thesis excerpt). Please direct informal enquiries to the same address or see * The University of Bonn is an equal opportunities employer. Preference will be given to suitably qualified women or persons with disabilities, all other considerations being equal." "up-to-dateness of DBpedia content" "uDear DBpedia developers, I'm currently trying to use some information from DBpedia and got stuck with some drawbacks: - All available Links to logos are invalid. 
I tried content of dbpedia-owl:thumbnail as well as foaf:depiction. After a short chat with Sören I also tested live.dbpedia.org - unfortunately with the same result :-( - Many information from en.wikipedia.org I don't find at all, e. g. Revenue or Profit. Could you please inform how often the content of DBpedia (with and without \"live.\") gets updated and when I should recheck the issues above? Thx, Christian uChristian, The data under entire Wikipedia is re-extracted completely from scratch. However, the data from The data is extracted by the DBpedia Extraction Framework ( extracted, usually based on custom extractors (for images for example) or based on community-generated mappings of Wikipedia infoboxes ( for example the extractors in [1] Revenue is indeed in the infobox company, and it is mapped: I tried this query: And noticed it's been extracted for 7764 cases [2]. (e.g. So it should have been extracted for Apple Inc as well. I noticed that there is a link alongside the value in the infobox markup, so that may have broken the extractor. I don't know. You may want to dig into the source code to try and figure it out. Also, this list of open source volunteers tends to be quite helpful if asked nicely. So maybe somebody may be able to pick up where I left. Cheers, Pablo [1] [2] http://mappings.dbpedia.org/server/templatestatistics/en/Infobox_company/ On Tue, Oct 25, 2011 at 3:37 PM, Christian Ehrlich < > wrote: uAm 25.10.2011 15:37, schrieb Christian Ehrlich: Looks like this is some namespace problem: This is the extracted link: That's the one which works: We are looking into the problem right now why this doesn't show up correctly on the live instance. Meanwhile you could just do a replace '/commons/'=>'/en/' as a work around. Sören uAny update on this issue? I don't see any improvement. For and link for foaf:depiction and also the workaround mentioned below does not work any more. Did you find the reason for this? Thx, Christian uHi Christian , first of all, sorry for the belated answer. On 12/07/2011 01:51 PM, Christian Ehrlich wrote: This issue is now fixed in DBpedia-Live so if you use values for dbpedia-owl:thumbnail and foaf:depiction. By time, you will find the correct values of those predicates for the other resources as well, as more articles are processed. Hope that solves your problem." "DBpedia datasets and their encodings" "uHello, Can you please explain me the encoding procedures that DBpedia uses for the datasets? It seems that everything was encoded from MacRoman, which is not a good thing for people working on non-Mac machines (I develop on a Mac, but the production environment will be a Linux, who does not have the slightest idea what is the MacRoman encoding). Take for instance the DBpedia resource of the writer 'José Saramago', represented as resource Jos%C3%A9_Saramago> on the datasets. I normally work on UTF-8, and I had to write a little script (in the end of the mail) to figure it out the encoding used, giving: Encoding José Saramago in ISO-8859-1: Jos%3F%A9+Saramago Decoding Jos%C3%A9_Saramago in ISO-8859-1: José Saramago uNuno, On 2 Apr 2009, at 12:44, Nuno Cardoso wrote: We use UTF-8 everywhere. There is no MacRoman in DBpedia. Your text editor saves your source files in UTF-8, but your compiler/interpreter interprets them in MacRoman. That gives rise to the funny effects you are seeing. > So, I must force MacRoman to properly encode entities to DBpedia> SELECT * WHERE { ?p ? That's UTF-8, not MacRoman. MacRoman would be Jos%8E_Saramago. 
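Back in the up-to-dateness thread above, Pablo's revenue-counting query did not survive in the archive; a query of roughly that shape would be the following (a sketch, not necessarily his exact query, and counts will differ between releases):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT COUNT(*) as ?cases WHERE {
  ?company dbpedia-owl:revenue ?revenue .
}

Asking the same pattern for a single resource (for example the Apple Inc. URI) is a quick test of whether a particular infobox value was picked up by the mappings.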
MacRoman is a single-byte encoding, thus the fact that the single character 'é' has been encoded as two octets '%C3%A9' should already tell you that you're not looking at MacRoman. The fact that US-ASCII characters remain unencoded while other characters are multi-byte encoded is a very strong clue that you're looking at UTF-8. I hope you realize how ironic your comments are! We use exactly the same encoding as Wikipedia: There is no such thing as an unencoded URI. You simply cannot have a character like é in a URI. When you enter into your browser, then your browser automatically encodes the URI (using UTF-8 follow by ) before sending it to the server. That's why it *appears* like unencoded URIs work in your browser. In reality, they wouldn't even be valid URIs. It is needed because RDF property names that contain the percent character cannot be serialised in RDF/XML. In general, to get from URIs to human-readable strings, don't mess around with the URI, but find the rdfs:label property of the resource. This advice holds for most RDF data. In DBpedia, it works for all instances (in the /resource/ namespace), and it works for all properties from the english-language infobox dataset. It does currently *not* work for properties in other languages, because their labels are not loaded into the DBpedia RDF store. But they are available for download. The portugiese one is here: Unfortunately, the labels in the dump are badly broken: \"popula_percent_ e3_percent_ a7_percent_ e3_percent_ a3o\" . The literal should read \"população\". It's a bug. (It would probably be a good idea for the DBpedia admins to load those dumps into the store after the bug has been fixed.) We already use UTF-8 everywhere. We cannot fix \"charset hell\" with new dataset releases. \"Charset hell\" exists because most developers don't care about Unicode or character encodings, even though they should. Internationalization is hard, unfortunately. Best, Richard uHello Richard, Yup, I realize that the problem/confusion is more on my side. Anyway, thanks by clearing out stuff. I've worked around a little more, and I have two comments: - The best approach is, in fact, as you said, to use the rdfs:label property of the resource. Regardless of the encoding hell that's happening on URIs, this should avoid it on SPARQL queries. - The properties indeed have a bug like you noticed. I'm sort of wrapping properties into objects with my own labels to tackle this problem. Are you planning a release soon? Cheers, Nuno Cardoso, PhD Student. www.tumba.pt - Search on the Portuguese Web! www.linguateca.pt - Distributed Resource Center for Portuguese Language Processing On Fri, Apr 3, 2009 at 16:23, Richard Cyganiak < > wrote:" "Bart Simpson chalkboard gags on Dbpedia" "uHi all, Bob DuCarme wrote a very nice DBpedia query to retrieve all chalkboard gags by Bart Simpson: SELECT ?episode,?chalkboard_gag WHERE { ?episode skos:subject . ?episode dbpedia2:blackboard ?chalkboard_gag } See for details It is always suprising again what kind of stuff you can retrieve from DBpedia :-) Have a nice weekend. Chris uon [1], bob ducarme wrote: he actually stumbled upon a bug in the extractor: [2] (the category of all season 12 episodes) states that it has skos:broader dbpedia:Category:The_Simpsons_episodes%7C12 which comes from the mediawiki notation of assigning a page to a category by inserting [[Category:TheCategory|letter]] where letter is used for sorting inside the category page. 
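To make Richard's advice from the encoding thread concrete, reading the display name from rdfs:label instead of decoding the URI looks like this (a sketch; the percent-encoded URI is the form found in the dumps):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Jos%C3%A9_Saramago> rdfs:label ?label .
  FILTER ( lang(?label) = "en" )
}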
(\"episodes%7C12\" is \"episodes|12\", and shows up on the parent category under \"letter\" 1) the extractor should simply strip off everything after the first \"|\", then looking for all episodes (and their chalkboard gags) would be possible. regards chrysn [1] [2] which has foaf:page [3] [3] Category:The_Simpsons_episodes%2C_season_12" "primary reference key for resources" "uI'm querying dpedia on my website and also augment each resource with additional properties. I save the additional data to a local database and keep reference to the dpedia resource. In live wikipedia, articles are split, merged or moved all the time which changes their url. How does dpedia handle such events? Is it possible to keep my aditional data in sync with the live dpedia? For example 1. User queries U2 via dpbedia 2. U2 is returned 3. user adds additional data to U2, i.e eye color 4. additional data is saved in local database, with a reference to U2 5. U2 is moved to U2_(band) in dbpedia 6. User queries U2 in dbpedia 7. U2_(band) is returned 8. local database cannot be loaded because U2_(band) is not referenced, only U2 Thanks uHi, I guess you could either analyse the change sets: or try this: or this: Modelling Provenance of DBpedia Resources Using Wikipedia Contributions Sebastian Am 05.09.2011 00:06, schrieb bizso09: uAh yes, and please be so nice and keep us updated about your project and how you solved it. Please email, if there is anything you can and want to contribute to DBpedia: code, data, patches, tools, enlightenment Sebastian Am 05.09.2011 00:06, schrieb bizso09:" "Version 0.2: Add your links to DBpedia workflow" "uDear all, Based on the feedback, I have updated the workflow to add your links to DBpedia: (Thanks to Sarven Capadisli and Søren Roug, who already contributed several thousand links. ) Here are open issues: 1. Metadata and provenance info: First we would need to agree on what kind of metadata, we want to keep and secondly, we probably need an expert to make an example and help us with the vocab. These might be helpful: I would be happy, if we could just say: * this linking file was modified by contributions of the following github users * Then have some property where these users could add their comments and documentation (e.g. how the links were created). * information into which graphs the datasets should be loaded and whether they should be available in the main graph and via Linked Data * maybe some statistics, e.g. which linking properties are used (these can be loaded automatically) So in a nutshell: provenance, attribution, documentation, statistics and access. 2. We need to decide, which kind of data is allowed (see other threads) 3. I would like to add some people to the githib repo and create a *link maintenance committee* . For the beginning, everybody who contributed a dataset, would become its maintainer and merge other contirbutions to the same dataset. Please email me, if you would like to join such a committee. All the best, Sebastian" "limited sparql query results?" "uHello everybody, I'm new to this list. I'm using dbpedia sparql endpoint for my thesis project at university. I'm developing a software that allows the user to easily make queries to the dbpedia sparql endpoint. While trying some queries i noticed that some of them (or all of them?) make dbpedia return a truncated reply. I can check this trying to put an OFFSET that makes me see that other results are there but they are just not returned all at once. Is this something i can achieve? 
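For the page-move scenario in the "primary reference key for resources" thread above, one practical check is to ask whether a stored URI has become a redirect (a sketch; dbpedia-owl:wikiPageRedirects is the property used for redirect pages in recent releases):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?target WHERE {
  <http://dbpedia.org/resource/U2> dbpedia-owl:wikiPageRedirects ?target .
}

A non-empty ?target means the stored URI now redirects and the local reference should be repointed; an empty result means it is still the canonical page.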
Do you have any hint for me to get all results somehow? Thank you for your time. Regards. uHi Alessandro, By \"truncated\" I presume you mean it is not possible to retrieve more that 1000 rows per query, which is an intentional restriction on the dbpedia sparql endpoint to protect it from mis-use ie someone inadvertently or otherwise overloading the service querying large result setsr. Thus as you have done below, to obtain larger result sets use LiMIT and OFFSET to traverse through the data in chunks if required. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 1 Mar 2010, at 10:16, Alessandro Diaferia wrote: u2010/3/1 Hugh Williams < > uAlessandro, On 1 Mar 2010, at 10:59, Alessandro Diaferia wrote: u uPersonally I like having a local copy of dbpedia; there are operations that I like to do (temporary closed world assumption, etc.) that are much more reliably done when you've got a local copy. The fact is that there are many kinds of funkiness in dbpedia, including data holes, key integrity issues and so forth that you'll never see if you use the SPARQL endpoint and that I wouldn't have discovered if I hadn't built a specialized system to efficiently represent dbpedia and freebase that enforces tougher integrity constraints than a conventional RDF stores. uPaul Houle wrote:" "From filepage to specific file?" "uHi all! I'm trying to fetch some logo's for at fact box. In the Rdf for I can find the url: and I can get from that one to But how do I get from the file page to a specific image? (pref. a jpg version) Like: Have a wonderfull monday. MVH /Johs. W." "Problems to run a local mirror of the lookup service." "uHello! I'm trying to run a local mirror of the lookup service, following the instructions from the page: lookup/tree/master/src/main/resources. On the instructions to clone and build dbpedia lookup, I do: git clone git://github.com/dbpedia/lookup.git, and then: cd lookup, but when I do: mvn clean install, I get the error: \"Failed to execute goal org.apache.maven.plugins:maven-war-plugin:2.1.1:war (default-war) on project dbpedia-lookup: Error assembling WAR: webxml attribute is required (or pre-existing WEB-INF/web.xml if executing in update mode)\". It seems to be an error about not having a web.xml file, but I've looked the files in isn't really a web.xml file coming with it. I have Maven 3.0.5 installed. Did anybody have the same problem? What can I do? Thanks. Hello! I'm trying to run a local mirror of the lookup service, following the instructions from the page: Thanks." "ORDER BY modifying not just order" "uWhen I run the following query on list of people that looks fine, where every person's name links to its DBpedia page. SELECT ?name ?person WHERE { ?person dbpedia2:birthPlace . ?person foaf:name ?name . } However, if I append \"ORDER BY ?name\" , then only the first two people are linked. The other ones are just quoted strings like this one: \" be a bug in OpenLink Virtuoso's \"SPARQL Explorer\" web interface. By the way, I am trying to fix the Berlin 1900 example that does not work anymore. It seems that the dbpedia2:death does not exist anymore and has been replaced with dbpedia2:died, and the examples have not been updated. I noticed that most examples in the \"Online Access\" page don't work, it is a pity, it used to be quite fun for beginners. Keep up the good work, Nicolas Raoul. uHi Nicolas, thanks for the pointers. any help with online access examples is highly appreciated. 
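To spell out Hugh's LIMIT/OFFSET advice from the "limited sparql query results?" thread above (a sketch; an ORDER BY is needed so that successive pages do not overlap or skip rows):

SELECT ?person WHERE {
  ?person a <http://dbpedia.org/ontology/Person> .
}
ORDER BY ?person
LIMIT 1000
OFFSET 2000

Raise OFFSET by the LIMIT on each request until fewer than 1000 rows come back.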
it's a wiki, feel free to edit :) Cheers, Georgi" "Meaning of the property wikiPageOutDegree" "uAhoy, I am trying to understand the meaning of this property [1] in DBpedia dataset. Is this number computed and how? Is it related to the TIA. Best, Ghislain [1] uHey Ghis, If I’m not getting wrong, this property is computed during the extraction process and is documented here: datasets#outdegree uGhislain Le 2 août 2016 à 18:24, Julien Plu < > a écrit : Ghislain uOn 02.08.2016 12:28, Julien Plu wrote: Is this the number of links *within* the article that refer to other articles, or the number of *other* articles referencing this article ? (In other words, could this number be used to rank articles ?) Thanks, Stefan uHi Stefan, I think the documentation is clear enough: Number of links emerging from a Wikipedia article and pointing to another Wikipedia article. Then you might use it for a ranking, like PageRank." "mappings with semantic mediawiki" "uhi dbpedia, I'm really excited about being able to contribute to the dbpedia schema. I think using mediawiki to compose dbpedia ontologies is a thoughtful and original approach. it is going slowly. and I always feel like I'm screwing something up. has anyone considered installing semantic mediawiki to allow html forms, and bulk edits? It's a really clever extension that I think could give us a quick visual ontology editor. I made this - when I was thinking that the challenge was finding new wp infoboxes to do, but really, the challenge is recreating the existing ontology right? I'm confident that at least some of this could be done in bulk, if mediawiki easily supported it. is there an irc channel for dbpedia people? warmly, from montreal -spencer uHi, On Sun, Nov 28, 2010 at 5:06 PM, Spencer Kelly < > wrote: With the next DBpedia release, we are going to release a graphical user interface to edit infobox mappings. We hope this will further increase contributions from the community. You can already have a look at the documentation and some screen shots: This list is great! I still see huge potential for DBpedia in creating new mappings for infoboxes. As your list shows, a big amount of infoboxes is not mapped onto the ontology yet. For all this data, there is therefore no *high quality* data in the /ontology/ namespace, but only raw data in the /property/ namespace which is of much lower quality. I am not aware of one. Best, Max" "DBpedia to Wikidata sameAs links" "uHello, Was there any dataset with the DBpedia to Wikidata sameAs links created during the DBpedia 2016-04 release? If it was, why is it not on the list of datasets with links? I managed to find the rdfs:type links to categories in this file , but not the sameAs links. Best regards, Adrian M.P. Brasoveanu Hello, Was there any dataset with the DBpedia to Wikidata sameAs links created during the DBpedia 2016-04 release? If it was, why is it not on the list of datasets with links? I managed to find the rdfs:type links to categories in this file Brasoveanu uHello, the dataset is called interlanguage links and is available for all DBpedia languages Cheers, Dimitris On Mon, Nov 14, 2016 at 4:30 PM, Adrian Brasoveanu < > wrote: u" "DBpedia preview could not be found" "uI'm trying to preview DBpedia datasets before download them. 
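As a concrete illustration of the ranking idea mentioned in the wikiPageOutDegree thread above (a sketch; the property lives in the http://dbpedia.org/ontology/ namespace and counts links going out of the article):

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?city ?degree WHERE {
  ?city a dbo:City .
  ?city dbo:wikiPageOutDegree ?degree .
}
ORDER BY DESC(?degree)
LIMIT 20

Swapping dbo:City for another class gives a rough popularity ordering within that class.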
I seems that the preview page is not available; I get 404 page error when I try to access any file preview such as: preview.php?file=2014_sl_en_sl_instance_types_en.nt.bz2 uHi Nasreddine, to quote the DBpedia download page: \"NOTE: Currently, there are no previews available for 2014 release. We are working on fixing this.\" Best, Heiko Am 12.11.2014 um 16:25 schrieb Nasreddine Cheniki:" "DBpedia ontology mapping wiki permission" "uHi, I requested editing privileges on the DBpedia mapping wiki last week but haven't heard anything back. My colleague Erik C. has also requested permission but not heard anything back. Whats happening? We are working to get the infobox data about human genes into DBpedia. (See the data on pages like this: thanks -Ben Hi,  I requested editing privileges on the DBpedia mapping wiki last week but haven't heard anything back.  My colleague Erik C. has also requested permission but not heard anything back. Whats happening? We are working to get the infobox data about human genes into DBpedia.  (See the data on pages like this: -Ben uHi Ben, we granted edit rights to User:Pleiotrope User:Bgood Sorry for the delay. Happy mapping, Anja can you give us the usernames On 11.01.2012 23:54, Benjamin Good wrote: uBenjamin, Are you guys associated with the GeneWiki project? If not, talking to them might be a good way to start. A lot of gene data is pulled into Wikipedia by bots such as Cheers, Pablo On Wed, Jan 11, 2012 at 11:54 PM, Benjamin Good < >wrote: u:) Yes, we are the technical team behind the gene wiki project. Eclarke/pleiotrope is the main author of that bot. Our goal here is to get the structured data flowing into Wikipedia through those infoboxes (and other future projects) flowing out as linked data via DBpedia. -Ben On Jan 20, 2012, at 4:08 AM, Pablo Mendes < > wrote: uVery nice! Do you already have a plan of action? The data seems to sit at one level of indirection away from the article, am I right? Take the example article: It contains: {{PBB|geneid=1017}} But all the data seems to sit on the (transcluded) template: I don' t think the extraction framework can reach such data at the moment. It, by default, goes over the articles and extracts everything there, but it does not parse the templates. Can anybody from the DEF dev team confirm this? Cheers, Pablo On Fri, Jan 20, 2012 at 5:15 PM, Benjamin Good < >wrote:" "extracting properties" "uHi, I'm trying to list the properties used for each class but a query like this select distinct ?Concept where {[] ?Concept ?o. ?o a } LIMIT 100 is too \"heavy\" for the dbpedia SPARQL endpoint and it dosen't work. How can I improve the query? thanks diego Hi, I'm trying to list the properties used for each class but a query like this select distinct ?Concept where {[] ?Concept ?o. ?o a < diego uHi Diego, On 12/24/2012 12:25 PM, Diego Valerio Camarda wrote: If you are interested in the properties in the ontology namespace you can use the following query: SELECT DISTINCT ?property WHERE { ?property rdfs:domain dbpedia-owl:Person } but if you are interested in all properties you should use the following query: SELECT DISTINCT ?property FROM WHERE { ?subject a . ?subject ?property ?value } LIMIT 100 But you it's better to run it against endpoint [1], as it is faster, or against DBpedia-Live endpoint [2]. Hope that helps. [1] [2] uthank you, I just realized that the query I pasted in the email was wrong and that I made some mess with the message I wrote! select distinct ?Concept where {[] ?Concept ?o. 
?o a } LIMIT 100 the query represents all the properties referring to Person Class and at the moment it works fine on dbpedia.org/sparql, probably there was some intense traffic on the endpoint when I was trying to use it is thank again, 2012/12/24 Mohamed Morsey < > uHi Diego, On 12/24/2012 02:52 PM, Diego Valerio Camarda wrote: This message thread contains the details you are looking for [1]. [1] msg02993.html" "/resource view displaying RDF" "uHi, Not sure if there's some work going on at the moment but when visiting : redirects to : I think this used to redirect to : Thanks, Rob urobl wrote:" "hosting bg.dbpedia.org" "uWe'd like to host bg.dbpedia.org. Who should we contact, and are there specific procedures we need to follow? Cheers! Vladimir uCCing the mailing list On 12/11/14 1:24 PM, Marco Fossati wrote:" "dbpedia-links: Recommendation for predicate "rdrel:manifestationOfWork" ?" "uHi *, we[1] want to provide links between lobid and dbpedia. We would like to use an other predicate as the recommended ones[2]. lobid describes manifestations of library resources, e. g. books. We find that predicates like \"owl:sameAs\" do not fit , because the dbpedia resources often don't only describe books but also other forms of manifestations (e .g. a play or a movie). Thus we use the predicate rdrel:workManifested[3] to link our manifestations to dbpedia resources, e. g.: . This \"means\" : the lobid-resource is a \"physical embodiment of an expression of a work\", and that work is the dbpedia-resource (while work is defined as \"A distinct intellectual or artistic creation\"). As for the dbpedia_links we would like to use the inverse predicate - that would be rdrel:manifestationOfWork[4] . This would imply that these dbpedia-resources are rdf:type rdrel:Work[5] (we find that quite fitting, although not every wikipedia entry is actually a \"work\" (e .g. [6] is rather a \"manifestation\" - sure someday someone will correct this entry ). What do you think - should rdrel:manifestationOfWork be recommended in our case? -o [1] [2] [3] [4] [5] [6] Five_Go_Off_to_Camp uHi Pascal, I merged your links into the repo. Very good, we will include them in the data base soon. I changed the readme to adjust for your requirements. Basically you have a 1:n mapping and thus you need a domain specific link predicate. I added this and other things to the readme: 1. (strict) All N-Triples files must be alphabetically sorted without duplicate triples for better diffs. This is in accordance with the Unix command: sort -u . 2. For a list of currently used predicates (might be extended easily, write to list), see the file predicate-count.csv - For 1:1 mappings we recommend to use these: owl:sameAs, umbel:isLike, skos:{exact|close|}Match - For 1:m, n:1 or n:m mappings it seems to make sense to use domain-specific properties such as Additionally, you can include types, which result from inference of the usage of the domain-specific linking property, e.g. the rdfs:domain of the property. E.g. rdrel:workManifested is rdfs:domain rdafrbr:Manifestation, which entails that DBpedia entries should be of rdf:type rdafrbr:Manifestation. 3. Note that we also count links to other classes as links, so if you want to add an external classification using rdf:type as linking property, that is fine as well. 
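A per-predicate count like the one that follows can be produced with a query of roughly this shape once the contributed link files are loaded into a store (a sketch; run it against whatever graph the link files were loaded into):

SELECT ?p (COUNT(*) AS ?links) WHERE {
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?links)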
I also included a predicate count: 880414 437797 330620 11970 10089 9273 2081 312 http://dbpedia.org/ontology/spokenIn Thanks again Pascal, Sebastian Am 03.05.2013 16:15, schrieb Pascal Christoph: uHi Sebastian, Am 06.05.2013 12:54 schrieb Sebastian Hellmann : :) I've adjusted[1] a misunderstanding resulting from my incorrectly first post , where I wrongly stated that: To be correct, it should be \" that the dbpedia entries are frbr:Works (not frbr:Manifestations). (Our provided links, however, were correctly predicated). [] my fault, see above, it should be: \"DBpedia entries should be of rdf:type rdafrbr:Work\" hm, we at hbz are hesitant with that. While we don't want to discuss this here and now we want to make you aware of arising problems: While we think an encyclopedia generally describes a kind of abstract \"work\" rather than a concrete \"manifestation\" (e. g. a concrete edition of a book), we find that others classify dbpedia resources in a different way. E .g, if I understand correctly, the \"bookmashup\" dataset seems , like lobid, to describe *manifestations* of books, not *works*. But the bookmashup links with predicate \"owl:sameAs\" to dbpedia resources. These different approaches might result in a contradictionary assumptions. This is another example why the \"owl:sameAs\" predicate should be used with great care. thank you Sebastian - looking forward to see links from dbpedia to our lod service :) pascal [1] uBookmashup came from an earlier time and maybe it makes sense to rethink owl:sameAs for bookmashup. They are of class foaf:Document. The only problem is, that we need a more general property to query such links. Use case is to query all outgoing links for all books: Current: select * Where { ?s rdf:type . ?s owl:sameAs ?o } If we start replacing owl:sameAs this gets more difficult: select * where { ?s rdf:type . ?s ?link ?o . FILTER ( ! regex(str(?o), \"^ } Pascal, do you know a property, which we can use in addition? rdf:seeAlso or skos:related or umbel:isLike to these match? We could 1. weaken existing owl:sameAs links 2. add a domain specific property. By the way, a foaf:Document is a frbr:Work or a frbr:Manifestation or both uLe 2013-05-06 17:43, Pascal Christoph a écrit : Hello, I'm new on this list so my question will probably sound naive, don't hesitate to just reply with links to relevant documentation. I am wondering why you are using camel case notation. Also it looks like you use directly English label rather than an identifier like a hash or decimal digits. Did you chose explecitly this architecture rather this other possibilty? If yes, what was your criteria to retain this architecture? Did you analyzed repercussions on internationalisation problems, and if yes, where can I find the report on this topic? Kind regards, mathieu uHello, I am Adrian and working with Pascal at the hbz on Linked Open Data stuff, especially lobid.org. I subscribed to this list some days ago. wrote: This sounds reasonable. I don't understand why you'd need such a query if ?o is something like . Or do I get something wrong here? rdfs:seeAlso would definitely be an improvement. skos:related has skos:Concept as rdfs:domain and thus is problematic. And umbel:isLike suggests a closer match than there might be. I think both makes sense. Weaken owl:sameAs to rdfs:seeAlso and allow adding domain-specific properties. or might be a frbr:Item or even a frbr:Expression. It shouldn't be two or more of these, though, as according to the general FRBR model they are disjoint. E.g. 
the FRBR core ontology (one of the different FRBR ontologies) reads: owl:disjointWith , , . But we better not start a FRBR discussion here. Usually, they absorb much time that is better spent elsewhere and don't lead to anything productive. Having said this, it might be we made a bad choice using these FRBRish properties in lobid.org. But we think, at least the distinction between work, manifestation and item makes sense. Anyway, you should only use of the domain-specific property - Adrian uHi Adrian, thanks for your feedback. I think we should use both: rdf:seeAlso and manifestationOfWork and also add the rdf:type FRBR:Work to DBpedia. I created two issues: I would be grateful, if you had the time to tackle them By the way, having these two triples: rdfs:subClassOf foaf:Document . rdfs:subClassOf foaf:Document . Is definitely not a contradiction, but completely ok. It is the patterns of having disjoint subclassesThis seems to be the correct intention. Sebastian Am 07.05.2013 13:48, schrieb Adrian Pohl:" "Fetching images from Wikipedia" "uThe DBPedia dataset \"Images\" has mappings from wikipedia ids to the actual URL where the image can be downloaded. Example: . However, many of the images have moved around and no longer exist. What is the proper way if I want to download (and locally) cache all (or at least a large number of) images from Wikipedia? I have downloaded the full Wikipedia dump \"enwiki-20080103-pages-articles.xml.bz2\" which do contain all image names (for example, [[Image:Anarchy-symbol.svg|]]\"). But that piece does not give me the actual URL to the image? What is the proper way to download all images? I could always call \" and from there get the actual URL (which in this case happens to be \" and download that image from the URL. This would require two calls to Wikipedia for each image I want to download (one to get the URL, and another to get the image). Is there anyway I can do this without putting such stress to their servers? It would be good if at least there was a way to get all URLs automatically so I do not have to do two calls but only one call per image. I have by the way downloaded the \"enwiki-20080103-image.sql.gz\" dataset, but it only contains meta data about the images, not the URLs so I can fetch them. Since DBPedias \"Image\" dataset contains all URLs I assume there is someway to obtain all the URLs in a batch. Or has the DBPedia team also fetched the URLs by doing one call per image to \" Thanks /Omid uOmid, See this thread from the mailing list archives for a summary of what we know about image (and audio file) URLs in Wikipedia: Richard On 4 Mar 2008, at 05:32, Omid Rouhani wrote: uThanks Richard, That was exactly what I had hoped to find! Thanks /Omid On Tue, Mar 4, 2008 at 3:10 AM, Richard Cyganiak < > wrote:" "dbpedia -> Freebase mappings, a cautionary tale" "uIn our last episode, I was trying to construct a list of countries that are 'going concerns'; that is, not Austria-Hungary. I decided a reasonable approximation could be had by creating a 1-1 mapping between dbpedia and country codes. It's easy to get iso country codes out of Freebase (one MQL query returns them all,) so then it's a simple matter of using the Fb <-> Dbpedia mappings to send them back to freebase. Wrong. 
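For the batch question in the "Fetching images from Wikipedia" thread above, note that the image URLs DBpedia has already extracted can also be pulled in bulk from the endpoint instead of being recomputed per page (a sketch; the same data ships in the Images dump):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?resource ?image WHERE {
  ?resource foaf:depiction ?image .
}
LIMIT 1000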
After I took out redirects, I found that this procedure mapped ISO country codes to multiple pages in dbpedia for 90 countries; if these were all obscure places, that would be one thing, but fbase:/en/japan mapped to a large number of topics in freebase, even after removing redirects Here's some raw triples from the NT dump just so you get the idea: . . . . . There are a total of 44 of these in the dump. A lot of these are redirects to dbpedia:Japan (easy to zap) and they all make sense as being aspects or components of 'Japan', even in some of them are a little batty, like \"Area 11\". For me, there's the immediate problem that, if I want to key my items to dbpedia, I need to establish (more or less) and that's it. So far as end users are concerned, there's got to be some single entity that corresponds to a country, and that entity has to be correct. Now, for me, the practical answer to use fbase:/en/japan as my key. Those of you who are using OWL reasoning, however, had better note that adding the freebase->dbpedia mappings to your copy of dbpedia will considerably change the meaning of your database." "Question Answering over DBpedia" "uHi all! We have set up a Question Answering demo over DBpedia 3.6 (based on Virtuoso). The coverage is still limited (focused more in recall rather than precision), but it can already find potential answers for hundreds of simple factual queries. A sample of Natural Language queries over DBpedia, which may be useful for some people in this forum, can be found in the demo site. We are interesting on collecting more user queries to improve and evaluate its performance and I can't think of a better forum to ask for it :-) If you are interested in exploring NL questions that can be answered by DBpedia, we are logging your queries: Many thanks! PS. The current interface is primarily for people exploring the technology. You can browse all the individual ontological triples that may be relevant to a user query (from where answers are derived) or the merged and ranked answers. Please note that depending on the query, you may have to wait a while for an answer! . Vanessa Lopez, Research associate at Knowledge Media Institute The Open Universtiy Milton Keynes, UK" "dbpedia lookup service unavailable" "uHi, I have just released a new version of LodLive ( uses dbpedia lookup service to make an user able to find a resource on dbpedia but the service is down (from yesterday) do you have any ideas on when it will be available again? thanks, diego valerio camarda Hi, I have just released a new version of LodLive ( camarda uHi Diego, I have restarted the service. For you and others that have services running on top of it, please consider hosting a mirror if you'd like to be in charge of its availability: Cheers, Pablo On Sun, Dec 2, 2012 at 11:53 AM, Diego Valerio Camarda < >wrote: uthanks! I will consider very seriously to host the service, bye, diego 2012/12/2 Pablo N. Mendes < >" "Open Position for Graduate Fellowship in Semantic Web Technologies at CNR-STLAB" "uTITLE: Open Position for Graduate Fellowship in Semantic Web Technologies at CNR-STLAB Important notice: send by February 19th to your CV, a motivational statement, and the contact of at least one referee (or a recommendation letter). The official procedure for being admitted to the selection is described on the official application post (see below) and the deadline is February 27th. 
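Related to the Freebase mapping problems described above, a quick way to see which Freebase URIs a single DBpedia resource is linked to (a sketch; the filter simply matches any freebase URI among the owl:sameAs links):

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?link WHERE {
  <http://dbpedia.org/resource/Japan> owl:sameAs ?link .
  FILTER regex(str(?link), "freebase")
}

Running it for each of the Japan-related URIs listed above shows whether they all end up pointing at the same Freebase topic.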
Topic: Theories and methods for knowledge extraction and representation at a web scale and their application to cultural heritage and eGovernment. Type of Grant: Graduate Fellowship Employer: Institute of Cognitive Science and Technologies of CNR Salary: EUR 19.367,00 (nineteen-thousand-three-hundred-sixtyseven/00) net of expenses in charge of CNR. Starting from: April 2014 Duration: 12 months Location: Rome, Italy Official application deadline: 27 February, 2014 Link to the official application post: Scientific responsible: Dr. Valentina Presutti Contact person (administrative issues): Type of Grant: Graduate Fellowship There will be a public selection procedure, based on qualifications and an interview, for the assignment of n. 1 (one) - Graduate Fellowship in order to conduct research related to the Scientific Area Information Sciences AND Computer Sciences at the Institute of Cognitive Sciences and Technologies, CNR, in the scope of the projects: eGovernment, Digital Libraries and Hermes, under the scientific responsibility of Dr. Valentina Presutti. To the selection may apply individuals who, whatever their nationality or age, are in possession of the following requirements at the date of expiry of the deadline for submission of applications: a) Degree in Computer Science or Engineering or Literature and Philosophy in accordance with the legislation in force before DM 509/99 or Degree in Computer Science, Engineering, Linquistics or Science of Language (or equivalent) in accordance with the regulations referred to in DM 509/99 or Master's Degree in Computer Science, Engineering Linguistics or Science of Language (or equivalent) in accordance with the regulations referred to in DM 270/04, with professional resume suitable for the conduct of research according to the specifications given in the following points (the candidate is in charge, penalty of exclusion, of demonstrating equiparation of graduation diplomas); b) All qualifications obtained abroad (bechelor's degree, doctorate, and any other qualification) shall be previously recognized in Italy in accordance with current legislation (information on the website of the Ministry of University and Scientific Research: www.miur.it. The equivalence of those diplomas obtained abroad who have not already been recognized in Italy with the expected formal procedure above, will be evaluated, with the only purpose of the present selection, by the Examining Committee constituted according to art. 6, paragraph 1 of the Regulations; c) Documented experience of research, development and application of semantic technologies. In particular, it is required expertise in at least one of the following areas: ontology design and open data, knowledge representation, and natural lanugage processing; d) Excellent knowledge of OWL, RDF, and SPARQL; e) Knowledge of mobile application development platforms; f) Knowledge and documented experience of Java development; preference will be given to candidates that know also other programming languages; g) English proficiency. THE ENGLISH CALL ON THE WEB-SITE DOES NOT HAVE LEGAL VALUE IN ITSELF, AND THUS DOES NOT SUPERSEDE THE ITALIAN VERSION OF THE CALL ANNOUNCEMENT (BANDO). uTITLE: Open Position for Graduate Fellowship in Semantic Web Technologies at CNR-STLAB Important notice: send by March 30th to your CV, a motivational statement, and the contact of at least one referee (or a recommendation letter). 
The official procedure for being admitted to the selection is described on the official application post (see below) and the deadline is April 9th. Topic: Theories and methods for knowledge extraction and representation at a web scale and their application to cultural heritage and eGovernment. Type of Grant: Graduate Fellowship Employer: Institute of Cognitive Science and Technologies of CNR Salary: EUR 19.367,00 (nineteen-thousand-three-hundred-sixtyseven/00) net of expenses in charge of CNR. Starting from: April 2014 Duration: 12 months Location: Rome, Italy Official application deadline: 27 February, 2014 Link to the official application post: Contact person (administrative issues): Scientific responsible: Dr. Valentina Presutti Type of Grant: Graduate Fellowship There will be a public selection procedure, based on qualifications and an interview, for the assignment of n. 1 (one) - Graduate Fellowship in order to conduct research related to the Scientific Area Information Sciences AND Computer Sciences at the Institute of Cognitive Sciences and Technologies, CNR, in the scope of the projects: eGovernment, Digital Libraries and Hermes, under the scientific responsibility of Dr. Valentina Presutti. To the selection may apply individuals who, whatever their nationality or age, are in possession of the following requirements at the date of expiry of the deadline for submission of applications: a) Degree in Computer Science or Engineering or Literature and Philosophy in accordance with the legislation in force before DM 509/99 or Degree in Computer Science, Engineering, Linguistics or Science of Language (or equivalent) in accordance with the regulations referred to in DM 509/99 or Master's Degree in Computer Science, Engineering Linguistics or Science of Language (or equivalent) in accordance with the regulations referred to in DM 270/04, with professional resume suitable for the conduct of research according to the specifications given in the following points (the candidate is in charge, penalty of exclusion, of demonstrating equiparation of graduation diplomas); b) All qualifications obtained abroad (bachelor's degree, doctorate, and any other qualification) shall be previously recognized in Italy in accordance with current legislation (information on the website of the Ministry of University and Scientific Research: www.miur.it). The equivalence of those diplomas obtained abroad who have not already been recognized in Italy with the expected formal procedure above, will be evaluated, with the only purpose of the present selection, by the Examining Committee constituted according to art. 6, paragraph 1 of the Regulations; c) Documented experience of research, development and application of semantic technologies. In particular, it is required expertise in at least one of the following areas: ontology design and open data, knowledge representation, and natural language processing; d) Excellent knowledge of OWL, RDF, and SPARQL; e) Knowledge of mobile application development platforms; f) Knowledge and documented experience of Java development; preference will be given to candidates that know also other programming languages; g) English proficiency. THE ENGLISH CALL ON THE WEB-SITE DOES NOT HAVE LEGAL VALUE IN ITSELF, AND THUS DOES NOT SUPERSEDE THE ITALIAN VERSION OF THE CALL ANNOUNCEMENT (BANDO)." "how to host URIs?" 
"uHello, I am a student and I have a project for getting my bachelor degree, the project is to create a linked dataset (for scientific thesis) and publish it on the web of data using an ontology, I've already build the structure of the Ontology using Protege and I exported it as an RDF/XML file, and now I don't know what the next step is, I need to host every URI I have so that I can call this linked data and then create a SPARQL endpoint (and a beautiful website if I have time), I searched for free APIs and frameworks who offer this service but it looks like I'll have to have my URIs hosted first, Please tell me how Dbpedia can help, I heard about the Dbpedia framework but I honestly don't really know how it works. I really need help, this Ontology and linked data world was unknown to me before I start the project I am kind of late I only have one month to finish everything. I hope that I will get an answer soon. Thank you in advance. Celia Ouabas. Hello, I am a student and I have a project for getting my bachelor degree, the project is to create a linked dataset (for scientific thesis) and publish it on the web of data using an ontology, I've already build the structure of the Ontology using Protege and I exported it as an RDF/XML file, and now I don't know what the next step is, I need to host every URI I have so that I can call this linked data and then create a SPARQL endpoint (and a beautiful website if I have time), I searched for free APIs and frameworks who offer this service but it looks like I'll have to have my URIs hosted first, Please tell me how Dbpedia can help, I heard about the Dbpedia framework but I honestly don't really know how it works. I really need help, this Ontology and linked data world was unknown to me before I start the project I am kind of late I only have one month to finish everything. I hope that I will get an answer soon. Thank you in advance. Celia Ouabas. uHi Celia, have you tried to host your URIs using Virtuoso server ? , Virtuoso is the server that DBpedia uses to host it's linked data. it also provides a Sparql endpoint [2] and API that you can use to run sparql queries from code. for general questions about semantic web technologies you might have more help on semanticweb.com Q/A [3] website. Regards 1- 2- 3- On Mon, Mar 24, 2014 at 1:41 AM, Celia OUABAS < >wrote:" "DBPedia mapping extractor for resources not available in English" "uHi all, We are exerimenting with the server module of the extraction framework to convert a Dutch Wikipedia page to the DBPedia ontology. Therefore, we created a few mappings for the Dutch (nl) language. We tested the mappings by calling the extractor from the server (running locally). This works for most resources, but not for those who have no equivalent page in English. An example: - The resource Marleen_Merckx (a Flemish actrice) exists in both the English and Dutch Wikipedia. Extraction runs fine and uses the mapping Mapping_nl:Infobox_acteur. - The resource Pol_Goossen (a Flemish actor) exists only in the Dutch Wikipedia. Extraction doesn't work on this page, even if the infobox is present. Is this a known issue? Is it possible to prevent this behavior, so extraction also works when the English resource is not available on Wikipedia? Best regards, Karel Hi all, We are exerimenting with the server module of the extraction framework to convert a Dutch Wikipedia page to the DBPedia ontology. Therefore, we created a few mappings for the Dutch (nl) language. 
We tested the mappings by calling the extractor from the server (running locally). This works for most resources, but not for those who have no equivalent page in English. An example: The resource Marleen_Merckx (a Flemish actrice) exists in both the English and Dutch Wikipedia. Extraction runs fine and uses the mapping Mapping_nl:Infobox_acteur. The resource Pol_Goossen (a Flemish actor) exists only in the Dutch Wikipedia. Extraction doesn't work on this page, even if the infobox is present. Is this a known issue? Is it possible to prevent this behavior, so extraction also works when the English resource is not available on Wikipedia? Best regards, Karel uHi, the issue you are referring is one of a few in the dbpedia internationalization effort, see [1] There is currently an effort to create a more flexible extraction framework (but it might take while) meanwhile we (people on the page) have developed modified versions to deal with them locally if you are interested to any of them, you are free to ask for an unofficial patch ;) if you want to be part of this effort, you are welcome to join the group, write your name and give any new ideas cheers, Jim [1] On Wed, Jan 12, 2011 at 3:31 PM, karel braeckman < >wrote:" "getting graph of data" "uHi, we'd like to take data as a graph for better query performance and usage of ontology.However, we couldn't find how to do it in DBPedia. DBPedia provides data only with a sparql end point. How can we achieve that ?? Hi, we'd like to take data as a graph for better query performance and usage of ontology.However, we couldn't find how to do it in DBPedia. DBPedia provides data only with a sparql end point. How can we achieve that ?? uhope I understand the question correct. DBpedia allows you to download a collection of datasets from here: and these are in N-Triple format (not RDF/XML). You should be able to find short description about each dataset from the above page as well. On Wed, Mar 2, 2011 at 4:53 AM, Erdem Begenilmis < u uI guess you are looking for SPARQL CONSTRUCT statement. Reference: This gives you the possibility to get \"personalised ontologies\" as result (= whole rdf statements). With SELECT in contrast you always get results of tables (in your query it's a table with one column). Here two examples: CONSTRUCT { ?prop category:Presidency_of_Barack_Obama} WHERE { ?prop category:Presidency_of_Barack_Obama} CONSTRUCT {?s ?prop category:Presidency_of_Barack_Obama. ?s \"I like it\".} WHERE {?s ?prop category:Presidency_of_Barack_Obama} With the Sparql Endpoint you can even choose a desired result format (at \"RDF/XML\". Jena should be able to import \"RDF/XML\" serialized strings. Greets, Benjamin Am 03.03.2011, 23:06 Uhr, schrieb Erdem Begenilmis < >:" "wikipedia errors and dbpedia refresh period" "uI'm currently writing a simple chat bot (in XQuery) which gets its data by scraping various web pages, mainly Wikipedia pages so I'm delighted to discover this excellent resource and its SPARQL interface. One subject area is English Football clubs and a brief investigation shows up a few gaps in dbpedia caused by errors in the original Wikipedia pages. For example a count of teams in the Premier league: SELECT count(*) WHERE { ?team skos:subject . ?team dbpedia2:league :Premier_League. } shows only 18 when there are 20. 
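Chris's count query lost its category URI somewhere in the archive; its intended shape is roughly the following (a sketch: the category name and the prefixes are reconstructed for illustration, not taken from the original mail):

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbpedia2: <http://dbpedia.org/property/>
SELECT count(*) WHERE {
  ?team skos:subject <http://dbpedia.org/resource/Category:Premier_League_clubs> .
  ?team dbpedia2:league <http://dbpedia.org/resource/Premier_League> .
}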
The causes are 1) One team (Wigan) has the league as FA Premier League (the older name) 2) Another team (Manchester City) has the infobox wrongly labeled (as Football club infobox). Now 90% accuracy is good but not good enough, particularly if you are a Man City fan! I can fix both these at source and thus fix my chat application, but I'd like to switch to dbpedia since I expect it to be faster, so I'm wondering what the refresh cycle is or whether there is a repair mechanism in the dbpedia triple store. Regards Chris Wallace Department of Computer Science and Digital Media UWE Bristol uHi Chris, due to the way Wikipedia editing works, there are lots of inconsistencies like the ones that you have discovered in Wikipedia. DBpedia currently takes Wikipedia data as it is and does not try to resolve any inconsistencies. Therefore, you are getting the wrong result for your count query. Resolving inconsistencies would require lots of manual (or semi-automatic) work, and it is an open question whether this work should be done by correcting DBpedia data or, looking more promising to me, whether it would be better to correct inconsistencies right within Wikipedia. We have lots of ideas for authoring tools that would support Wikipedia authors by pointing them at inconsistencies and proposing ways to correct them, but we did not do any work in this direction yet as we are still looking for funding. The DBpedia update cycle is currently 2 months and the next release of the dataset will be published in January. Cheers Chris uThanks Chris, that's what I thought and, as you say, dbpedia makes a great tool for Wikipedians to find these discrepancies and correct them - I'll do what I can to tidy this subject area - do you have a date when you will download? Chris uHi Chris, This would be great. Jens and Jörg will run the extraction job against the Wikipedia dump that is current in January. Jens, Jörg: Can you guess the deadline for Chris' edits to get into this dump? Cheers Chris uChris Bizer wrote:
This would be great, but I don't think that the Wikipedia community has lots of funds for research projects at Universities, when I look at how they collecting donations on the Wikipedia site just to buy servers. It would be great if I'm wrong. Do you know more? Any hints appriciated. Chris" "problem accessing the DBPedia cluster" "uIs there a status page for DBPedia where I can see when there is a problem with the cluster at Thank you, Brian" "sex or dbo:gender or foaf:gender?" "uShould we use dbo:sex or dbo:gender or foaf:gender for a person's gender? - EN dbpedia uses dbo:sex for horses (as a literal) and dbo:gender for schools (as a resource); neither for people This has given little guidance to other DBpedias which to use, e.g. - PT, TR, EL use dbo:gender - BG uses dbo:sex - very few use foaf:gender for example: fr:Infobox Personnage (fiction) Furthermore, the property definitions should be more definitive as to what values are allowed. 1. Says rdfs:range xsd:string, and owl:equivalentProperty wikidata:P21. - This cannot be true since wikidata:P21 is an object property with a fixed set of values: Allowed values: for persons:male (Q6581097), female (Q6581072), intersex (Q1097630), transgender female (Q1052281), transgender male (Q2449503), genderqueer (Q48270), fa'afafine (Q1399232), mâhû (Q3277905), kathoey (Q746411), fakaleiti (Q350374), hijra (Q660882) ; for animals: male animal (Q44148) or female animal (Q43445) 2. Says rdfs:domain owl:Thing, rdfs:range owl:Thing and doesn’t specify the values. Both properties should document what values are allowed. For BG dbpedia we intend to use dbo:gender dbp:Male, dbp:Female" "Wikipedia lists in DBpedia?" "uHello all There are a lot of \"List of \" in Wikipedia, and I thought they were somehow translated in DBpedia, but unless I miss something, I don't see anything equivalent to e.g., This list is pretty much up-to-date, including e.g., the recent changes in Ukraine. The closer equivalent I could find is but I wonder what \"Current\" means here, since Nicolas Sarkozy is still among the instances but not François Hollande (the latter replaced the former in May 2012 for those who missed the event) Even if one does not expect real-time data, this is quite a long delay for updating Thanks for any clue on this. uHi Bernard, concerning the extraction of knowledge from list pages, which is not a straight-forward problem as it may seem, we had a paper last year at the DBpedia&NLP; workshop [1]. As far as up-to-date-ness is concerned: the current build of DBpedia is based on a dump from May 2013. However, mappings to YAGO may still be older, as they are not extracted freshly from Wikipedia when DBpedia is built, but from the YAGO version which is up to date at the time of the extraction [2]. Hope that helps, Heiko [1] [2] Am 04.03.2014 10:52, schrieb Bernard Vatant: uThere are papers on “list” extraction from text and tables… Techniques usually involve: * Natural Language Processing to identify patterns characteristics of lists in unstructured text * Wrapper Induction to extract similar items from consistently designed tables or web pages, across web pages * Large-scale statistical mining based on content redundancy across tables/pages Also, note that DBpedia had some initiative around table/list extraction in the past: See — Nicolas Torzec Yahoo Labs. From: Heiko Paulheim < > Date: Tuesday, March 4, 2014 at 3:22 AM To: Bernard Vatant < > Cc: DBpedia Discussions < > Subject: Re: [Dbpedia-discussion] Wikipedia lists in DBpedia? 
" "Semantic GIS site based on dbpedia, freebase, openstreetmaps, etc." "uHello, We just launched a new site at which is based on data from dbpedia and freebase and uses openstreetmaps for mapping. Behind it all is a 'semantic GIS' engine that combines the ability to represent traditional GIS with the ability to make assertions such as "The Empire State Building is in Manhattan." A particularly remarkable feature is that very few of the images are geotagged: we're able to establish the locations of the photographs based on text and other available evidence. We've currently got an internal RDF vocabulary that does a good job of making assertions like "Picture A is a picture of topic B" "Image C is an instance of Picture A" "Image C is 473 pixels high" "Image C is available at " We're thinking a lot about how we're going to expose these assertions to the general public: we used SIOC for car pictures, but it doesn't seem to be all that useful for the major use case I'm thinking of: providing end users with the ability to query an endpoint for creative commons images about topic X and get back a list of images [in different sizes] with sufficient metadata to use usefully. Any suggestions or ideas?" "DBpedia (EN) - how many triples?" "uHi All, I am eager to know about the number of triples present in the English version of DBpedia at the public DBpedia endpoint: Thanks. uHi Vishal, On 02/25/2013 05:55 AM, Vishal Sinha wrote: Do you mean you want to get the count of triples of the endpoint? uHi Mohamed and everybody else, Yes, I am curious to know about the count (i.e., the number of triples in the current public DBpedia endpoint - only the English version of DBpedia 3.8: Thanks. From: Mohamed Morsey < > To: Vishal Sinha < > Cc: " " < > Sent: Monday, February 25, 2013 12:15 PM Subject: Re: [Dbpedia-discussion] DBpedia (EN) - how many triples? uHi Vishal, You can use the following links for details Best, Dimitris On Mon, Feb 25, 2013 at 7:44 PM, Vishal Sinha < > wrote: uHi Vishal, On 02/25/2013 06:44 PM, Vishal Sinha wrote: You can use the following SPARQL query: SELECT COUNT(*) as ?numOfTriples WHERE {?s ?p ?o} Hope it helps.
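As a side note, a plain count like this covers every graph loaded on that endpoint, not only the English DBpedia datasets. A minimal sketch that restricts the count to a single graph (assuming the default graph name <http://dbpedia.org> used by the public endpoint) would be:
SELECT (COUNT(*) AS ?numOfTriples) WHERE { GRAPH <http://dbpedia.org> { ?s ?p ?o } }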
uHi Mohamed, When I ran query you mentioned: SELECT COUNT(*) as ?numOfTriples WHERE {?s ?p ?o} I got the number: 269437410 But on the following queries, I got different answers: SELECT COUNT(?s) as ?numOfTriples WHERE {?s ?p ?o} I got the number: 269437410 SELECT COUNT(?p) as ?numOfTriples WHERE {?s ?p ?o} I got the number: 309794693 SELECT COUNT(?o) as ?numOfTriples WHERE {?s ?p ?o} I got the number:  309794736 Can anybody explain the reasons why different number as an answer on the above SPARQL queries on the same endpoint?  Thanks. From: Mohamed Morsey < > To: Vishal Sinha < > Cc: \" \" < > Sent: Monday, February 25, 2013 11:29 PM Subject: Re: [Dbpedia-discussion] DBpedia (EN) - how many triples? Hi Vishal, On 02/25/2013 06:44 PM, Vishal Sinha wrote: Hi Mohamed and everybody else, SELECT COUNT(*) as ?numOfTriples WHERE {?s ?p ?o} Hope it helps. uThis subject is thoroughly discussed in the \"Fair use policy\" note in [1] Best, Dimitris [1] On Mon, Feb 25, 2013 at 8:22 PM, Vishal Sinha < >wrote:" "DBpedia 3.7 released, including 15 localized Editions" "uHi Chris! This caught my attention; do you know what UMBEL linkage version you used for this release? Is this the one we recently published with umbel 1.0? Thanks, Fred" "Mapping server down" "uHi, Could somebody please fix the mapping server? We would like to start working on building the mappings for the Dutch namespace but we are on hold for a number of weeks now. At the moment we can write the mappings but we can't test them (and the statitics aren't working either). Please let us know when things are really hard to fix, perhaps we can organize some assistance. Thank you in advance! Best regards, Enno Meijers Hi, Could somebody please fix the mapping server? We would like to start working on building the mappings for the Dutch namespace but we are on hold for a number of weeks now. At the moment we can write the mappings but we can't test them (and the statitics aren't working either). Please let us know when things are really hard to fix, perhaps we can organize some assistance. Thank you in advance! Best regards, Enno Meijers uSorry for the inconvenience. We've been movin a few servers. The mappings server should be back tomorrow or the day after. JC On Wed, Oct 10, 2012 at 6:37 PM, Enno Meijers < > wrote: uHi Jona, Thanks for the good news, goodluck with finishing the job! Best, Enno Van: [ ] namens Jona Christopher Sahnwaldt [ ] Verzonden: woensdag 10 oktober 2012 18:55 To: Enno Meijers Cc: < > Onderwerp: Re: [Dbpedia-discussion] Mapping server down Sorry for the inconvenience. We've been movin a few servers. The mappings server should be back tomorrow or the day after. JC On Wed, Oct 10, 2012 at 6:37 PM, Enno Meijers < > wrote: uHi Enno, the server is back now. Please let us know if you encounter any problems. Regards, JC On Wed, Oct 10, 2012 at 7:11 PM, Enno Meijers < > wrote: uHi Jona, Thanks for bringing it back to life! Everything looks fine now (and much quicker than before). Regards, Enno Van: [ ] namens Jona Christopher Sahnwaldt [ ] Verzonden: donderdag 18 oktober 2012 23:05 To: Enno Meijers Cc: < > Onderwerp: Re: [Dbpedia-discussion] Mapping server down Hi Enno, the server is back now. Please let us know if you encounter any problems. Regards, JC On Wed, Oct 10, 2012 at 7:11 PM, Enno Meijers < > wrote: uHi Jona, When changing the home page for the Dutch mappings ( Interne fout Detected bug in an extension! 
Hook UltrapediaAPI::onArticleUpdateBeforeRedirect failed to return a value; should return true to continue hook processing or false to abort. Backtrace: #0 /var/www/mappings.dbpedia.org/includes/Article.php(1785): wfRunHooks('ArticleUpdateBe', Array) #1 /var/www/mappings.dbpedia.org/includes/EditPage.php(1061): Article->updateArticle('Help mee me', 'verwijzingen na', false, false, false, '') #2 /var/www/mappings.dbpedia.org/includes/EditPage.php(2536): EditPage->internalAttemptSave(false, false) #3 /var/www/mappings.dbpedia.org/includes/EditPage.php(466): EditPage->attemptSave() #4 /var/www/mappings.dbpedia.org/includes/EditPage.php(351): EditPage->edit() #5 /var/www/mappings.dbpedia.org/includes/Wiki.php(541): EditPage->submit() #6 /var/www/mappings.dbpedia.org/includes/Wiki.php(70): MediaWiki->performAction(Object(OutputPage), Object(Article), Object(Title), Object(User), Object(WebRequest)) #7 /var/www/mappings.dbpedia.org/index.php(118): MediaWiki->performRequestForTitle(Object(Title), Object(Article), Object(OutputPage), Object(User), Object(WebRequest)) #8 {main} Despite of this error the changes were saved. Regards, Enno Van: [ ] namens Jona Christopher Sahnwaldt [ ] Verzonden: donderdag 18 oktober 2012 23:05 To: Enno Meijers Cc: < > Onderwerp: Re: [Dbpedia-discussion] Mapping server down Hi Enno, the server is back now. Please let us know if you encounter any problems. Regards, JC On Wed, Oct 10, 2012 at 7:11 PM, Enno Meijers < > wrote: uHi Jona, I'm currently trying to submit new mappings and ontology elements, but I get a blank page after pushing the 'Save page' or 'Validate' buttons in the wiki. Submissions are not saved. Cheers, Marco On 10/18/12 11:05 PM, Jona Christopher Sahnwaldt wrote:" "Too many redirects?" "uHi, I'm trying to set up some museum data, following best practice guidelines, with IIS 6 as the web server. Because IIS 6 doesn't have mod_rewrite support, I have taken the Smart 404 setup ( added support for filtering by Accept type. This all seems to work nicely: my base URLs such as: get 303'ed correctly to and depending on the Accept settings in the page header. So far, so good. The problem is that I then do a \"permanent\" re-direct of the mapped URL to an ASP which will actually deliver the required RTF, e.g.: &file;=WTcoll&index;=Identity%20number&key;=GRMDC.C104.6 When using tools such as the OpenLink RDF Browser, I only get sensible results if I enter this full form of URL: entering the \"resource\" URL doesn't work. Is it the case that I need to engineer my \"resource\" URLs so that they work without any redirection? (If so, they may need to become a bit less elegant ) Thanks, Richard uFurther to this, I have (on my son's advice) added an explicit \"RDF+XML\" Content-Type to the returned resource whenever this is specified in the Accept header in the request. This doesn't seem to help. I have also used 303 (SEEALSO) at each stage, rather than the hard-wired 301 (PERM) which I was using for the final stage. I've now realised that what I am seeing in the OpenLink RDF Browser is an RDF summary (of sorts) of the HTML page which you are sent to when you don't specify Accept: application/rdf+xml. Meanwhile, Disco just says \"No Information to Display\" unless I give it the final fully-converted URL. Which leads me to wonder whether these RDF browsers are setting the Accept header to \"application/rdf+xml\", when submitting an HTTP request for the resource whose URL you have typed in. 
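A quick way to check this outside a browser is to send that Accept header with curl and follow the whole redirect chain, then look at the Content-Type of the final response. A minimal sketch, where <host> is a placeholder standing in for the museum server's /object/ URIs discussed in this thread:
curl -sIL -H "Accept: application/rdf+xml" http://<host>/object/GRMDC.C104.6
The -L flag makes curl follow each 303 hop and print the headers of every response, so both the intermediate Location headers and the final Content-Type are visible.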
Richard In message < >, Richard Light < > writes uRichard Light wrote: Richard, Please send the URI of an OpenLink Browser session as a reply to this mail. You can also try: This will aid us in determining what's going on here. So far here is what I see: Pass 1: Kingsleys-Second-PowerBook-G4-130:~ kingsleyidehen$ curl -I -H \"Accept: application/rdf+xml\" HTTP/1.1 303 See also Date: Tue, 10 Jun 2008 15:06:42 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Location: number&key;=GRMDC.C104.6 Content-Length: 475 Content-Type: application/rdf+xml Set-Cookie: ASPSESSIONIDSAQBTRAQ=NPILDMBDBNBDNMJANPHCKOBH; path=/ Cache-control: private Pass 2: Kingsleys-Second-PowerBook-G4-130:~ kingsleyidehen$ curl -I -H \"Accept: application/rdf+xml\" \" number&key;=GRMDC.C104.6\" HTTP/1.1 400 Bad Request Content-Length: 20 Content-Type: text/html Date: Tue, 10 Jun 2008 15:08:05 GMT Connection: close Note: number&key;=GRMDC.C104.6 Kingsley uI'm sure you don't mean this: Is this something to do with the Session menu? I have saved a session as rbltest in the default WebDav directory I was offered (/DAV/home/demo). This shows the two properties which I get when I ask it to open: As I said, it looks as though it is bombing out at some point in the redirection and giving me the HTML page (though possibly \"type-cast\" as RDF+XML) instead of the RDF. One thing I'm not sure about is whether the \"Accept: application/rdf+xml\" header will persist through a number of SEE ALSO redirects. If not, that wouldn't help This does similar things: it thinks it's a \"Document\". Sorry about this. After writing my last message, I was playing around with the redirection strategy just to see if I could make it behave better. In particular, I was trying a \"single redirection\" strategy to see if it made any difference. (It didn't.) If you try curl again you should get the expected results now. (This is what I get:) Pass 1: C:\Program Files\curl-7.18.2>curl -I -H \"Accept: application/rdf+xml\" lections.wordsworth.org.uk/object/GRMDC.C104.6 HTTP/1.1 303 See also Date: Tue, 10 Jun 2008 15:50:36 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Location: Content-Length: 357 Content-Type: application/rdf+xml Set-Cookie: ASPSESSIONIDSAQBTRAQ=JCJLDMBDIFDBPPMNHBDJBMMJ; path=/ Cache-control: private Pass 2: C:\Program Files\curl-7.18.2>curl -I -H \"Accept: application/rdf+xml\" \" llections.wordsworth.org.uk/object/resource/GRMDC.C104.6\" HTTP/1.1 303 See also Date: Tue, 10 Jun 2008 15:51:30 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Location: ect&file;=WTcoll&index;=Identity number&key;=GRMDC.C104.6 Content-Length: 493 Content-Type: application/rdf+xml Set-Cookie: ASPSESSIONIDSAQBTRAQ=KCJLDMBDBFODKMPAJCOBJICL; path=/ Cache-control: private Pass 3(!): C:\Program Files\curl-7.18.2>curl -I -H \"Accept: application/rdf+xml\" \" llections.wordsworth.org.uk/resource.asp?class=object&app;=Object&file;=WTc oll&ind; ex=Identity%20number&key;=GRMDC.C104.6\" HTTP/1.1 200 OK Date: Tue, 10 Jun 2008 15:52:42 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Content-Length: 1192 Content-Type: application/rdf+xml Expires: Tue, 10 Jun 2008 15:52:42 GMT Set-Cookie: ASPSESSIONIDSAQBTRAQ=LCJLDMBDNBMLPNAFGOKECDAL; path=/ Cache-control: private Richard uRichard Light wrote: I mean: This is becuase you are providing a Document URI. And the document in question doesn't provide a route to any associated RDF resources via their respective URIs (information resource URI or an Entity URI that will return a Description serialized in RDF/XML). 
We are going to look for RDF data source URIs using the following approaches: 1. Content Negotiation (which at this point gave us nothing beyond the fact that the URI is for a basic document) 2. \" /> 3. GRDDL 4. RDFa Currently here is what we are seeing: kingsleyidehen$ curl -H \"Accept: application/rdf+xml\" \" /object/GRMDC.C104.6; source: /object/*; target: /object/resource/* Object moved Object moved The requested object can currently be found here . Wheras option one would be: kingsleyidehen$ curl -H \"Accept: application/rdf+xml\" \" /object/GRMDC.C104.6; source: /object/*; target: /object/resource/* Object moved /> Object moved The requested object can currently be found here . Anyway, do you have a file containing RDF somewhere? Or a SPARQL Endpoint that can DESCRIBE an URI associated with an RDF Data Set? Curl doesn't currently lead to RDF in either of the aforementioned forms. Kingsley uIn message < >, Kingsley Idehen < > writes worth.org.uk%2Fresource.asp%3Fclass%3Dobject%26app%3DObject%26file%3DWTco ll%26index%3DIdentity%20number%26key%3DGRMDC.C104.8 gives me some RDF. This is the form of URL which I am trying to end up with after my 303's. The RDF is generated dynamically from XML sources. (I fully expect my RDF to be lousy, but I need the browser to show it to me so I can improve it!) something like what you have below. I was naively assuming that if I returned a 303 and set the Location property in the HTTP header, that would be all I needed to do Richard" "How is the paper an interface, gateway, to the web of data?" "uSePublica2012 an ESWC2012 Workshop. May 27-31, Heraklion, Greece. At Sepublica we want to explore the future of scholarly communication and scientific publishing. As we are going through a transition between print media and Web media, Sepublica aims to provide researchers with a venue in which this future can be shaped. Consider research publications: Data sets and code are essential elements of data intensive research, but these are absent when the research is recorded and preserved by way of a scholarly journal article. Or consider news reports: Governments increasingly make public sector information available on the Web, and reporters use it, but news reports very rarely contain fine-grained links to such data sources. At Sepublica we will discuss and present new ways of publishing, sharing, linking, and analyzing such scientific resources as well as reasoning over the data to discover new links and scientific insights. Workshop Format We are planning to have a full day workshop with two main sessions. During the first part of the workshop accepted papers will be presented; the second part of the workshop will address by means of focus groups two main questions, namely “what do we want the future of scholarly communication to be?” and “how could data be preserved and delivered in an interactive manner over scholarly communications?”. These focus groups will be followed by a panel discussion. As an outcome of these activities we will have a communique that will be the editorial for the workshop proceedings, Dates * workshop papers submission deadline: Feb 29 * workshop papers acceptance notification: April 1 * workshop papers camera ready: April 15 *Submission* Research papers are limited to 12 pages and position papers to 5 pages. For system/demo descriptions, a paper of minimum 2 pages, maximum 5 pages should be submitted. Late-breaking news should be one page maximum. All papers and system descriptions should be formatted according to the LNCS format . 
For submissions that are not in the LNCS PDF format, 400 words count as one page. Submissions that exceed the page limit will be rejected without review. Depending on the number and quality of submissions, authors might be invited to present their papers during a poster session. The author list does not need to be anonymized, as we do not have a double-blind review process in place. Submissions will be peer reviewed by three independent reviewers; late-breaking news get a light review w.r.t. their relevance by two reviewers. Accepted papers have to be presented at the workshop (requires registering for the ESWC conference and the workshop). Issues to be addressed - Representation: - Formal representations of scientific data; ontologies for scientific information - What ontologies do we need for representing structural elements in a document? - How can we capture the semantics of rhetorical structures in scholarly communication, and of hypotheses and scientific evidence? - Integration of quantitative and qualitative scientific information - How could RDF(a) and ontologies be used to represent the knowledge encoded in scientific documents and in general-interest media publications? - Connecting scientific publications with underlying research data sets - Technological Foundations: - Ontology-based visualization of scientific data - - Provenance, quality, privacy and trust of scientific information - Linked Data for dissemination and archiving of research results, for collaboration and research networks, and for research assessment - - How could we realize a paper with an API? How could we have a paper as a database, as a knowledge base? - How is the paper an interface, gateway, to the web of data? How could such and interface be delivered in a contextual manner? Applications and Use Cases: - Case studies on linked science, i.e., astronomy, biology, environmental and socio-economic impacts of global warming, statistics, environmental monitoring, cultural heritage, etc. - Barriers to the acceptance of linked science solutions and strategies to address these - Legal, ethical and economic aspects of Linked Data in science science" "Indonesian DBpedia" "uHi all, Right now, I am developing Indonesian DBpedia. For giving information about my project related to Indonesian DBpedia, here are my detail informations :  Landing page :  One of example resource URL :  SPARQL endpoint :  I have sent pull request for Indonesian configuration [1] and I have mapped 47 infobox template. You can check all of the templates in this URL [2].  For finalizing my project, I want to change my domain into the official domain (id.dbpedia.org). Is there any change for me to complete it? What should I do? Besides that, in developing Indonesian DBpedia, I have some problems and I have not found the solution for them. For giving the information about these problems, you can check my submitted issue in [3]. I need your help for solving this issue.    Thanks for all.  [1]  [2]  [3]  Regards, Riko Hi all, Right now, I am developing Indonesian DBpedia. For giving information about my project related to Indonesian DBpedia, here are my detail informations : Landing page : Riko" "making a query more efficient?" "uSometimes the following query works, and sometimes it times out. Can anyone give me any suggestions about ways to make it more efficient, or at least point out which parts are most inefficient? (Besides the filters, which I assume add a good deal to the required processing.) 
SELECT ?s, ?artistName, ?album, ?wpURL, ?releaseDate, ?prevAlbum, ?nextAlbum, ?producer, ?coverURL WHERE { ?s dbpedia2:artist ?artistName; dbpedia2:name ?album; foaf:page ?wpURL; dbpedia2:released ?releaseDate; dbpedia2:lastAlbum ?prevAlbum; dbpedia2:nextAlbum ?nextAlbum; dbpedia2:producer ?producer; dbpedia2:cover ?coverURL. FILTER (regex(?artistName, "Miles")). FILTER (regex(?album, "Blue")). } LIMIT 30 thanks, Bob DuCharme bobdc.blog uHi Bob, Not sure about the namespace for dbpedia2, but try to change the filters to: ?artistName bif:contains "Miles". ?album bif:contains "Blue". Good luck, Fred u" "where are dbp:ethnicGroup ?" "uWe're looking to make a full list of cultures/ethnic groups/periods/styles from DBP, WD, AAT and the BM ethName thesaurus. Such an authoritative list would be a valuable contribution to CH data. DBP doesn't have a class for "X people" but I found a very useful property: dbp:ethnicGroup E.g. for groups: Shompen Mainland Indians Jarawa Onge Sentinelese Great Andamanese 4 are links (objects) and 3 are mere literals, that's why I use dbp:ethnicGroup and not dbo:ethnicGroup, which leaves only objects. On (actually 6 of the 7 since it's derived from this old revision of 17 January 2015: Unfortunately this query: PREFIX dbp: select * {?x dbp:ethnicGroup ?y} returns only 15 results on Where are the rest? uHi Vladimir, I think you overlooked the final "s" in the property name, i.e. dbp:ethnicGroups. Your query thus only returns a few misspellings. The correct query yields 825 results (and 1035 on live) including Andaman, though a large part is not necessarily usable (includes percentages, free text). Cheers." "Disambiguation dataset in 2015-04 dump" "uHello all, I was wondering if it was normal that there is still no disambiguation dataset available on the download page of the new DBpedia 2015-04 dump? If yes, has it been excluded/deleted for any reason? If no, is it possible to have it somehow? Thanks in advance. uHi Julien, Maybe there is an error in the download page but the datasets exist in the download folders e.g. On Mon, Jan 11, 2016 at 4:16 PM, Julien Plu < > wrote: uThanks guys for your quick answer! I was looking for disambiguation pages for the English version, and as Dimitris said it must be an error in the download page on the wiki, because it doesn't appear: Best regards." "How to query "is of" by SPARQL" "uHi All, I'm new to SPARQL and try to query some information using DBpedia. I get a problem when I try to query a property value which starts with "is of". For example, when I access this page : dbpedia.org/resource/Coffee, it displays "is dbprop:exportGoods of". However, when I try to access the value of this property using SPARQL it does not give any result. I try the other properties without "is of" and the query gives results. What's wrong with my query?
Thank you for your help. regards, Hendrik uOn Sat, Nov 21, 2009 at 4:35 PM, Hendrik < > wrote: The "is of" idiom is a way of talking about the inverse direction of properties. So if "danbri has 'age' property '37'", then we also get "'37' is age of danbri." This can get confusing when the property names already have 'of' in the name: "Cat subClassOf Mammal." & "Mammal is subClassOf of Cat" is a bit confusing. I wish in the original RDF WG we'd used 'superProperty' and 'superClass' instead as the names for those relationships, then we could say 'Cat superClass Mammal. Mammal is superClass of Cat.', which would have worked fine. But back to your query: try reversing the variable names; when it tells you something 'is dbprop:exportGoods of', it means that there is some other object with an exportGoods property whose value is the current thing. Hope this helps, Dan uhi, this problem also occurs with the categories and skos:broader. all concepts are solely connected with broader - there's no such property as narrower - i thought it was done for saving triples, but you have to rewrite certain queries and sometimes formulate like check out consider the following when adjusting your query : - the underlying query of - if you see an "is of" on a page, the subject of that triple has changed. i.e. resource x broader x skos:broader y x skos:broader z is broader of a skos:broader x b skos:broader x wkr www.turnguard.com" "rdf resource file errors when processing" "uHi, I have encountered some trouble with many files in DBpedia while trying to read them using Jena, due to errors in the files (as I guess). Here I list two of the major issues I found. 1. parsing error QName - in the rdf file for Italy, in line 3812 column 8, there is the following line mentioning " The Boot; The Belpaese Error in reading model at URI: at com.hp.hpl.jena.rdf.model.impl.RDFDefaultErrorHandler.fatalError(RDFDefaultErrorHandler.java:45) at com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:35) at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:225) at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.fatalError(XMLHandler.java:255) I found many of these files and got the same error when trying to process these rdf files. One other example is the following file, 2. premature end of file. example for this resource, ERROR [main] (RDFDefaultErrorHandler.java:44) - Premature end of file. Error in reading model at URI: com.hp.hpl.jena.shared.JenaException: org.xml.sax.SAXParseException; Premature end of file. at com.hp.hpl.jena.rdf.model.impl.RDFDefaultErrorHandler.fatalError(RDFDefaultErrorHandler.java:45) at com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:35) at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:225) at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.fatalError(XMLHandler.java:255) What can we do about these errors in DBpedia? Is there any solution for these?
The major concern for me is issue 1, since the file is already in good shape except for one QName error. Any suggestions? Thank you." "Virtuoso configuration" "uHi, I think I missed something about the Virtuoso configuration, because when I try to dereference any URI like : no data are displayed. I followed this page : and configured it like this (from the DBpedia FR documentation) : registry_set ('dbp_decode_iri', 'off'); registry_set ('dbp_domain', ' registry_set ('dbp_graph', ' registry_set ('dbp_lang', 'fr'); registry_set ('dbp_DynamicLocal', 'on'); registry_set ('dbp_category', 'Catégorie'); registry_set ('dbp_imprint', ' registry_set ('dbp_website', ' registry_set ('dbp_lhost', ':80'); registry_set ('dbp_vhost', ' before installing the DBpedia and rdf_mapper VADs (which I compiled at the same time as Virtuoso). Can someone tell me what I did wrong? Best. Julien." "Sparql Queries and dbpedia:Category:foo" "uHello, can anyone explain to me why the following query does not work as expected: SELECT ?subcat WHERE { ?subcat } Looking at matches. Right now, I don't get any. Regards, Michael uOn 7 Sep 2009, at 15:36, Michael Haas wrote: Looking at that page, I think you want to use skos:broader, not skos:subject? Best, Richard uRichard Cyganiak wrote: You're absolutely right. Quoting the skos spec[0]: "A triple skos:broader asserts that , the object of the triple, is a broader concept than , the subject of the triple." I had that backwards. Thanks a lot! Regards, Michael [0] #semantic-relations" "Unstable answers from DBpedia SPARQL endpoint?" "uDear DBpedians, the results delivered by the DBpedia SPARQL endpoint are apparently not entirely stable. Below are two result sets for the following query (no matter what it is supposed to do ;-)), retrieved about an hour apart from each other. SELECT ?x (COUNT(?s) AS ?c0) WHERE { {SELECT ?x WHERE {{ ?y0 ?z0. ?z0 ?y1 ?x} UNION {?x ?y0 ?z0. ?z0 ?y1 }}}?x ?s. ?s} ORDER BY DESC(?c0) LIMIT 300 Are there currently updates going on? Or is that a bug?
Best, Heiko 28.03., 18:46 \"x\",\"c0\" \" \" \" \" \" \" \" \" \"http://dbpedia.org/resource/Mona_Simpson\",6 \"http://dbpedia.org/resource/Lisa_Brennan-Jobs\",6 \"http://dbpedia.org/resource/Laurene_Powell_Jobs\",6 \"http://dbpedia.org/resource/FileMaker_Inc.\",4 \"http://dbpedia.org/resource/Apple_ID\",2 \"http://dbpedia.org/resource/Scosche_Industries\",2 \"http://dbpedia.org/resource/Delicious_Monster\",2 \"http://dbpedia.org/resource/The_Omni_Group\",2 \"http://dbpedia.org/resource/DivX,_Inc.\",2 \"http://dbpedia.org/resource/Barack_Obama\",2 \"http://dbpedia.org/resource/Gold\",1 \"http://dbpedia.org/resource/Los_Angeles\",1 28.03., 19:51 \"x\",\"c0\" \"http://dbpedia.org/resource/Apple_Inc.\",252 \"http://dbpedia.org/resource/Apple_Store\",42 \"http://dbpedia.org/resource/NeXT\",39 \"http://dbpedia.org/resource/Claris\",28 \"http://dbpedia.org/resource/IPhone_4S\",24 \"http://dbpedia.org/resource/IPhone_4\",22 \"http://dbpedia.org/resource/Steve_Jobs\",18 \"http://dbpedia.org/resource/Apple_ID\",10 \"http://dbpedia.org/resource/IPhone_5\",8 \"http://dbpedia.org/resource/FileMaker_Inc.\",8 \"http://dbpedia.org/resource/IPod_Touch_2\",6 \"http://dbpedia.org/resource/Laurene_Powell_Jobs\",6 \"http://dbpedia.org/resource/Mona_Simpson\",6 \"http://dbpedia.org/resource/Lisa_Brennan-Jobs\",6 \"http://dbpedia.org/resource/Siri_%28software%29\",6 \"http://dbpedia.org/resource/MMA_Pro_Fighter\",6 \"http://dbpedia.org/resource/Minecraft\",4 \"http://dbpedia.org/resource/The_Walt_Disney_Company\",4 \"http://dbpedia.org/resource/IPod_Touch_3\",4 \"http://dbpedia.org/resource/Ecoute\",4 \"http://dbpedia.org/resource/Norton_Internet_Security\",4 \"http://dbpedia.org/resource/New_York_City\",4 \"http://dbpedia.org/resource/ICloud\",3 \"http://dbpedia.org/resource/United_States\",3 \"http://dbpedia.org/resource/NeXT_Computer\",2 \"http://dbpedia.org/resource/NeXTcube_Turbo\",2 \"http://dbpedia.org/resource/Delicious_Monster\",2 \"http://dbpedia.org/resource/The_Omni_Group\",2 \"http://dbpedia.org/resource/Scosche_Industries\",2 \"http://dbpedia.org/resource/Los_Angeles\",2 \"http://dbpedia.org/resource/M4V\",1 \"http://dbpedia.org/resource/Gold\",1" "local dbpedia invirtuoso." "uOk. 1) I download ru en fr folder of dbpdeia 3.9. 2) Then i make ld_dir_all('ru', '*.ttl.gz', ' ld_dir_all('en', '*.ttl.gz', ' ld_dir_all('fr', '*.ttl.gz', ' ld_dir_all (wikidata, '*.ttl.gz', ' ld_dir_all (owl, '*.ttl.gz', ' 3) After that. My local dbpedia same with original dbpedia ?? 4) Sparql queries return same results, how in original dbpedia? uHi What are you seeking to do , setup a copy of the DBpedia canonical instance at Details on setting up a specific instance of Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // Weblog" "Inverse properties" "uHi, Why are inverse properties not declared explicitly in the ontology? For example, properties such as \"is dbpedia-owl:location of\" are used extensively in the resource descriptions (e.g. This is causing me some difficulties in an application I am constructing, since I would like to cut and paste selections of these properties into a new format. So I would like all statements to be valid URIs. Any suggestions? thanks, Csaba" "Convert wikipedia pages of a language" "uHi How can I convert Wikipedia page to RDF statements? I speak about create rdf dump from Wikipedia pages of a language that does not have rdf version yet like as Persian language.  Could you suggest me the best tool or technique? Any help would be greatly appreciated! 
uHi Sareh, AFAIK, there is a DBpedia dump for the Persian language, you can use it. On 12/05/2011 09:36 AM, sareh aghaei wrote: If you want to get a more recent version of Persian DBpedia, you can do the following: 1- download the DBpedia source code available at [1]. 2- install the Scala framework. 3- configure the DBpedia dump module, such that it downloads and processes the dumps of the Persian language only. 4- run the DBpedia dump module through Maven, and it will download the dumps of the Persian language and convert them afterwards into RDF. Hope that helps." "Face-to-Face meeting at ISWC" "uFor those of you who are currently at the ISWC in Boston, maybe it would be interesting to have a face-to-face meeting for the people involved or interested in extraction or enhancement of DBpedia and its language-specific versions, in order to get in touch and maybe to discuss some current aspects. I would suggest having a collective table tomorrow for lunch. Commitments and Comments: Regards Magnus uOn 11/13/12 2:09 PM, Knuth, Magnus wrote: OK, this overrides my previous mail :-) Marco uOn 13.11.2012 at 14:20, Marco Fossati wrote: Sorry for that race. I updated the poll. Let's choose." "BBC Things" "uHi there, The BBC has recently released bbc.co.uk/things, and 4.5k BBC Things have sameAs links to dbpedia (more to come). We have a mapping of these relations so far; is it possible for these relations to be included in dbpedia, and if so how do we go about doing that? Is this repo the right place to start? Cheers, Amy uThanks. Just gotta double check some licensing related stuff; should have the all-clear on Monday. Amy On 26 September 2014 18:33, Kingsley Idehen < > wrote: uHi Amy, Great news! The DBpedia links repo should do for this Dimitris On Fri, Sep 26, 2014 at 8:39 PM, Amy G < > wrote: uThanks! Now they're submitted, how long before they're added to the dbpedia store? (Not being impatient, just wondering :) Amy On 29 September 2014 14:52, Dimitris Kontokostas < > wrote: uAwesome, thanks :D Amy On 2 October 2014 09:11, Dimitris Kontokostas < > wrote: uAmy, The BBC links are already loaded in DBpedia Live i.e. Kingsley, can you upload them in dbpedia.org as well? Best, Dimitris On Tue, Sep 30, 2014 at 4:04 PM, Kingsley Idehen < > wrote: uThis dataset has been added to dbpedia.org, dbpedia-live.openlinksw.com and lod2.openlinksw.com Example: Patrick" "Issue with properties mapping" "uHi, When I see the validation page for the French mappings I get these validation errors : Whereas the properties exist : Does anyone have an idea why this error occurs? Thanks. Julien. uHi Julien, this should work if you write them with a starting lower case letter, i.e.
purchasingPowerParity Cheers Andrea 2013/6/18 Julien Plu < > uAh yes I haven't seen this exception, thank you :-) Best. Julien. 2013/6/18 Andrea Di Menna < >" "alternative name =" "uHello! When trying to get hold of a list of people's name in DBpedia, I stumbled against a couple of names with \"ALTERNATIVE NAME= \" in them, e.g. Not sure where that's coming from, as I can't see it anywhere on the original Wikipedia page? Best, Yves uHi Yves, If you look at the correct revision [1] you'll see that the following infobox definition in missing a '|' and Piñero, Dolores ALTERNATIVE NAMES = is taken as a full string {{Persondata uHi Dimitris, since the wikipedia article has been corrected today, why is the change not visible on Live DBpedia yet? Is the system correctly working? Thanks Andrea 2013/1/9 Dimitris Kontokostas < >: uAh, sorry - Andrea spotted that earlier and I didn't realise he emailed me off-list. I corrected this particular one, but there are quite a few others. I'll see if I can extract a list and fix the corresponding Wikipedia entries. Best, y On Wed, Jan 9, 2013 at 6:49 PM, Dimitris Kontokostas < > wrote:" "How do you fix typos in DBpedia ontology?" "uHi all I stumbled on this property in dbpedia-owl With matching label \"causalties\" Which should be of course \"casualties\", both in URI and label. What is the process regarding such typos? - Do nothing for sake of retro-compatibility? - Change both URI and label? How? When? Will the change be documented? - Keep the URI and change the label only (potentially confusing)? Note that the situation is already confusing because the matching has the correct spelling Best regards Bernard uHi Bernand, in this specific case I would make sure the correct ontology property [1] is used in the mappings where [2] is currently used, that is [3]. After that we should remove [2] as it is mispelled and a correct property already exists. Apart from this specific case, I am not sure whether there exists a general process to handle such cases. @DBpedians? [1] [2] [3] 2014-03-11 15:12 GMT+01:00 Bernard Vatant < >: ugeneral process to handle such cases. I think it is pretty much what you said. On Mar 11, 2014 7:57 AM, \"Andrea Di Menna\" < > wrote:" "DBpedia 3.7 dumps .nt file encoding issues" "uHi, I get a couple of question related to the newest dumps as i published a howto for setting up a local DBpedia mirror quite some time ago on my blog. One I can't answer is related to the encoding of URIs or IRIs in the new dump files: From the DBpedia 3.6 dump: de/labels_de.nt.bz2 and en/labels_en.nt.bz2 From the DBpedia 3.7 (no i18n) 3.7/data/HardDrive2/DBpedia/3.7/data-enUris-compressed/en/labels_en.nt.bz2 (btw. maybe someone could recreate the provided all_languages.tar not to include the absolute path on your server?): From the DBpedia 3.7 (no i18n) 3.7/data/HardDrive2/DBpedia/3.7/data-enUris-compressed/de/labels_de.nt.bz2 From the DBpedia 3.7 (i18n) 3.7-i18n/all_languages-i18n/en/labels_en.nt.bz2 From the DBpedia 3.7 (i18n) 3.7-i18n/all_languages-i18n/de/labels_de.nt.bz2 Are the 3.7 .nt files with non ASCII chars in the URIs valid ntriples files? > The Internet media type / MIME type of N-Triples is text/plain and the character encoding is 7-bit US-ASCII. and there's no ö or ō in ASCII (the .nt files actually seem to be UTF-8 encoded). Aside from finding this inconsistent and inconvenient I ask myself: is it planned to use IRIs where we used URIs before? Another question arising from this: Will DBpedia now use the IRI ? 
If so how can i request it in sparql? select * where { ?p ?o. } or select * where { ?p ?o. } or select * where { ?p ?o. } cheers, Jörn uHi, On 20. Sep. 2011, at 18:09, Dimitris Kontokostas wrote: Thanks, but the link you provide doesn't mention \"IRI\" once. I agree that it might be useful to have In terms of RDF the IRI Also try requesting That being said i just want to know the reason for introducing this ambiguity, which will cause more requests, more potential for errors, etc. Well, it's not hard to rename all those files and as i'm not the only one who ran into this problem, i'd suggest to rename them on the server side. The el.dbpedia.org SPARQL endpoint seems to be down. Also be aware that there will be quite a lot of people who aren't able to type those requests. I know this shouldn't be a key point, as \"in the near future intelligent agents will construct all those requests for us\", but in the meantime it is a problem. I don't have those keys on my keyboard :-/ Jörn uyou have a point on that :) they are indeed different, but as I said, the English version didn't change at all, it still uses URIs. That is the reason you don't get results from Only 15 multilingual versions use IRI and their own namespace (e.g. de.dbpedia.org). You might don't have the letters in your keyboard, but people who want to write queries in their language don't have to memorize the Unicode mappings to translate them to %-encodingThe same applies for readability, Cheers, Dimitris" "Geographical Coordinates" "uHi all, I am beginning a mapping project that examines the geography of Wikipedia articles. To do this, I would like to start by looking at how many articles are tagged to each country of the world. I see that there geo-data are available ( there anyway to get these data so that I can construct a file organised by lat/long coordinates (that I could then import into a GIS)? Thanks in advance for the assistance, Mark uHi Mark, it shouldn't take much more than a few dozen lines of code to generate a file organised by lat/long (not sure what you mean by that) from the data in the geo_*.nt or geo_*.csv DBpedia files. For example, you could parse such a file and write its data into an SQL database. Or you could load the data into Virtuoso or another RDF store and use SPARQL queries. I'm afraid we don't have the data in any other format that might make your task easier. Christopher On Wed, Nov 4, 2009 at 12:26, Mark Graham < > wrote: uHi Christopher/everyone I managed to parse out the coordinates (lat and long) from the geo_pt.nt files just to try things out. So, it is my understanding that the coordinates represent all geotagged wikipedia articles in portuguese. I then mapped these coordinates and you can view the resultant maps at are just quick and dirty maps in order to error check the data). If you look at the two maps, you will see some very strange patterns (e.g. a heavy focus on France, and no data in South America). Is this simply a reflection of locations that have had a particular focus by wikipedians interested in geotagging? or does anyone have any explanations as to what is going on here? Also, I notice that the geo_pt.nt contains 16706 unique locations. However, I decided to just crosscheck this against the wikipedia data displayed at geonames ( seem to have far more points (65,456). Any thoughts or ideas will be much appreciated. 
Thanks, Mark uHi, I don't know where the number 65,456 comes from, but here are some thoughts: - the new DBpedia extraction (will be online in a few days) contains geo data extracted from 29,436 Portuguese Wikipedia articles - about half of geonames' number. - DBpedia only extracts data from non-English articles that have an interwiki link to an English article, so we don't use e.g. but geonames does. - displays map entries for some places that don't have a Portuguese Wikipedia article, for example 65,456 probably includes data from other sources. Probably. has 30,156 entries, most of which seem to have geo data. That could mean half of all 65,456 Portuguese geonames entries are in France, and maybe a similar number for Portuguese DBpedia geo entries. Christopher" "Getting one value from a list of values of a predicate" "uHi, I want to get one value from a list of values of a predicate. For example, in this query I want to get the area of each country from DBpedia. Some areas are repeated more than once. How do I get just one value for each country? PREFIX endbp: PREFIX res: SELECT distinct ?area ?countryLabel WHERE{ ?country1 rdf:type dbpedia-owl:Country; dbpedia-owl:areaTotal ?area; rdfs:label ?countryLabel. FILTER( lang(?countryLabel) = "en" ) } Thanks Regards Olivier uHi Olivier, On 07/01/2013 11:58 PM, Olivier Austina wrote: The point is that some countries have several values for areaTotal, e.g. Luxembourg [1]. This is because, in the Wikipedia page [2], you can find two values for the area, one in square kilometers and one in square miles, which when converted might produce slightly different values. [1] [2] index.php?title=Luxembourg&action;=edit" "dbpedia + disambiguation pages" "uChris, Since your question is quite specific to DBpedia, let's continue the discussion at the DBpedia mailing list (see #support and CC). Please consider removing from the CC list for further replies. On 4 Aug 2007, at 20:18, Chris Richard wrote: No, we currently don't do any special processing for Wikipedia's disambiguation pages. The main focus of DBpedia is extraction of information about the *things* described in Wikipedia articles, to enable domain queries over this information. Disambiguation information isn't really about those things, it's about the names we use to refer to those things. (Specifically, when a single name could refer to more than one of those things.) So it's more linguistic in nature, and hasn't registered prominently on our priority list. I think that a large part of the disambiguation information could be captured using relatively simple heuristics. There's no need to capture everything, 80% might be “good enough”. The DBpedia codebase has pluggable “Extractors” that produce RDF triples from an article's source code; this would be yet another extractor. I don't know how to represent this in RDF. DBpedia defines one resource from each Wikipedia article, assuming that the topic of each article is some meaningful entity in the real world. This certainly doesn't hold for disambiguation pages, whose topic is not a single thing, but a multitude of things that happen to be related to some name, word, or term. Can you give us some examples where you think this information could be used?
The next update will include dbpedia:redirectsTo triples for redirected articles. Note that redirects are often not synonyms, but artifacts of Wikipedia's evolution. Redirects contain things like misspelled names, names that adhere to older naming conventions (e.g. the original WikiWords CamelCase naming convention), instances where multiple articles were folded into one etc. Thus they make poor labels. Cheers, Richard uI would like to follow up on a point raised by Chris Richard and discussed by Richard Cyganiak on 2007-08-05; namely - how should Dbpedia deal with disambiguation pages? Recall that Wikipedia uses disambiguation pages to allow users to navigate from ambigous query terms (like en \"bank\") to the appropriate article e.g. concerning financial instutions, river parts, sea beds or whatever. Usually the first link on the disambiguation page is to what is thought to be the most common sense of the term. In Dbpedia disambiguation pages are treated like any other article. For the \"bank\" disambiguation page, the article-label triple is: \"Bank (disambiguation)\"@en . The links to the appropriate articles (senses of \"bank\") in the Wikipedia disambiguation page are available as Dbpedia pagelinks e.g. # financial institution . # sea floor . I am interested in using DBpedia in applications which analyse text and try to identify which concepts are referenced. Suppose I have a sentence which contains the word \"bank\" and that I have identified it as a noun. I want to be able to look up \"bank\" in Dbpedia and find all the things this might mean; later on in the analysis I'll try to eliminate senses of \"bank\" that are inappropriate in the context. This is a classic scenario in language engineering and I am sure I am not alone in considering the use of Dbpedia for this sort of thing. However, the only resource in DBpedia that has the label \"bank\" is the financial institution . To find other resources that can be referred to as \"bank\" I would have to do special processing (e.g somehow transforming \"bank\" into \"Bank (Disambiguation)\" and then following through pagelinks to the different articles). Rather than sorting this out in my own corner, I would much prefer that something be done about this in the Dbpedia deliverables One option might be to follow the disamb page links and add the ambiguous label to each of the referenced articles. So, for example, we would add a label \"Bank\" to the resource already labelled \"Bank (sea floor)\", and so on. This added label should be distinguished in some way e.g. perhaps with skos:altLabel. \"Bank (sea floor)\"@en . \"Bank\"@en . This approach does not tell us which was the first resource mentioned on the disambiguation page. But I believe that in general it is this item which has the amiguous term as its label eg the first item on the bank disambiguation page is the financial institution sense which is labelled \"Bank\". An application using Dbpedia would just follow rdfs:label (rather than skos:altLabel) to find what Wikipedians consider to be the Ur-sense of the term. For existing users of Dbpedia there would be no change: the current labels would continue to work exactly as before. Any solution to disambiguation also needs to be compatible with a solution to redirects (which I believe is on the Dbpedia task list). In the pagelinks, it would be nice to distinguish the links from disambiguation pages from other more innocent wikilinks. Or perhaps explicity flag resources representing disambiguation pages in some way. 
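To make the proposal concrete, a rough N-Triples sketch of the two added statements could look like this; the exact DBpedia resource URI and its encoding are assumptions for illustration:
<http://dbpedia.org/resource/Bank_(sea_floor)> <http://www.w3.org/2000/01/rdf-schema#label> "Bank (sea floor)"@en .
<http://dbpedia.org/resource/Bank_(sea_floor)> <http://www.w3.org/2004/02/skos/core#altLabel> "Bank"@en .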
As you may suspect, I would be pretty happy if Wiktionary was also available in RDF and it had systematic links from senses to DBpedia resources. But I don't suppose that is going to happen this week Lee Humphreys - SPSS - Paris uLee, Thanks for the thoughtful post, and I fully agree with your analysis. Note that finding the links from or to is easy, a special template is used throughout Wikipedia to create these links. Finding all the senses of an ambiguous term requires heuristics, but looks doable. The only thing we need is someone who implements this. Trying to be realistic, I think the core team is not likely to quickly run out of higher-priority tasks, so the best hope of getting this done soon is through an external contribution. Richard On 22 Oct 2007, at 19:57, Humphreys, Robert Lee wrote: uOn 10/22/07, Richard Cyganiak < > wrote: I don't know if this is obvious or not, but not all disambiguation pages are in the style: "Foo (disambiguation)". They do all have a template included though: {{disambig}}. Sometimes the disambiguation page is just "Bank" if the term is particularly ambiguous :) Judson User:Cohesion" "Text Searching in Virtuoso / 2 questions" "uHi all! This is more for the Virtuoso folks but still cc-ing dbpedia. Please give us some input if you can answer these questions: I am trying text search in Virtuoso from isql against a locally loaded DBpedia. *QUESTION ONE* Why does bif:contains return faster than REGEX search, and why are they returning a different number of counted rows? The search string is not the real string but it does not change the question. Which should we use? (1) searching with regex SQL> sparql select count(*) where { ?s ?p ?o . FILTER regex(?o , "searchstring", "i")}; callret-0 INTEGER 4344 1 Rows.
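For comparison, a minimal sketch of the bif:contains counterpart of query (1), again with a placeholder search string, would look like this in isql:
SQL> sparql select count(*) where { ?s ?p ?o . ?o bif:contains "searchstring" };
Note that the two counts need not match, since bif:contains matches whole indexed words rather than arbitrary substrings.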
The extractor parses first the commons dump to look for non-free licence terms in the images description pages. These terms are given in the ImageExtractorConfig file. However I don't understand why is there a different list of terms for each language. As I understand there is only one wikimedia commons version with a different interface for each language. The source pages are supposed to be in english right ? Best, Julien uHi Julien, Licence terms are in fact templates and are defined serarately for each language. This is the reason the English Wikipedia has only one template for non-free licence ({{non-free}}) while other languages like Greek do not. In Greek we have a seratate template for every licence, so we had to manually find which ones are \"non-free\". I guess you 'll have to do something similar for French too Cheers, Dimitris On Tue, Jan 24, 2012 at 4:05 PM, Julien Cojan < > wrote: uThanks for your reply Dimitris, However I don't get about the implications on the usability in different countries. Does this means that some English wikipedia pages could be breaking the law in France ? And what about languages spoken in different countries ? Do DBpedia dumps for English pages only abide by the US regulation or does it take the most restrictive over all English speaking countries ? Cheers, Julien On 01/24/2012 03:28 PM, Dimitris Kontokostas wrote: uYou have a point there, but I didn't go that deep when I customized the extractor :) Although we only keep images with open licences, a \"Wikipedia guy\" would be more appropriate to answer for all the legal implications. What I did was just to look for patterns in image usage in the Greek Wikipedia and nothing else. Tested my results, they seemed OK and I thought that this was the way for other Wikipedia/DBpedia editions as well. We are always open for corrections / improvements;) Cheers, Dimitris On Tue, Jan 24, 2012 at 6:20 PM, Julien Cojan < > wrote: uOn 24 January 2012 16:20, Julien Cojan < > wrote: In short, no, but IANAL, TINLA, etc. (i.e., if you need a definitive answer, consult a lawyer). Also, I'm glossing over a lot of the details. The images (and other media: video, audio) are separate files: they are not part of the database dumps. All of the text in each wikipedia can be reasonably expected to be free, and can be considered to be a single work. Each image (etc.) is an individual work, and can carry its own licensing terms. Usually, these are similar in spirit to cc-by-sa, or less restrictive. Some wikipedias will tolerate (i.e., they are discouraged, but permitted) non-free images (and some don't - the Polish Wikipedia does not allow any non-free images). In the case of the English Wikipedia, most of these are low-resolution reproductions of 'important' images, and are flagged as being used under the 'fair use' doctrine. A complication here is public domain images, because what is and isn't in the public domain varies greatly from territory to territory. Something published before 1923 will typically be in the public domain in the US, but may not be anywhere else. The Wikimedia Foundation is based in the US, as are the Wikipedia servers, so to a greater or lesser extent, US law applies to all wikipedias (there may be local wikimedia foundations that correspond to a particular language's edition of wikipedia, which adopt a set of rules more appropriate to their location, but the servers are still in the US). uis that the reason for all broken image links? for example dbpedia:SomeResource foaf:depiction . 
most values for foaf:depiction are different from the real values at wikipediathe differences appear at commons/langtag and N assuming other numeric valuesdbpedia values for depiction are returning 404 Cheers, Mauricio On 1/24/12, Dimitris Kontokostas < > wrote: uOn 01/24/2012 07:23 PM, Jimmy O'Regan wrote: Does it mean that a link towards a copyrighted data is not copyrighted ? In this case would it be possible to link dbpedia instances to any image from Wikimedia Commons provided that the license terms are also specified ? As I see the copyrights are indicated by Templates in Wikimedia Commons pages. Would it be possible to use mappings to related this templates to license terms in RDF ? Cheers, Julien uHi Mauricio, There was a recent bug update in the framework ([1]) You can check in DBpedia Live if the issue is fixed, otherwise you can report back with some sample error pages Cheers, Dimitris [1] On Tue, Jan 24, 2012 at 8:35 PM, Mauricio < > wrote: uOn 27 January 2012 17:02, Julien Cojan < > wrote: IANAL, TINLA, consult a lawyer, etc., but 'probably not'. A link is 'mere fact' under US law, and not eligible for copyright protection. If all you're talking about is providing a set of triples that include links to images on wikimedia commons, then there should be absolutely no problem. Yes. There would be some effort involved, as there are quite a number of templates just for licence, but it's definitely possible. If you've been following other threads on this mailing list, you may have seen the recent Ookaboo announcement, which may be what you're looking for. uHi Mauricio , On 01/24/2012 07:35 PM, Mauricio wrote: This issue is fixed in DBpedia-Live available at \" You can check it and send us your feedback about it. uHi All, I want to add ImageExtractor config for id, but I don't understand about NonFreeRegex. I tried to add  \"id\" -> \"\"\"(?i)\{\{\s?(Copyright by Wikimedia)\s?\}\}\"\"\".r, and it's work.  Can you explain to me about NonFreeRegex[1] ? [1]  Regards, Riko Hi All, I want to add ImageExtractor config for id, but I don't understand about NonFreeRegex. I tried to add \"id\" -> \"\"\"(?i)\{\{\s?(Copyright by Wikimedia)\s?\}\}\"\"\".r, and it's work. Can you explain to me about NonFreeRegex[1] ? [1] Riko uHi Riko, I added some comments in the config file. Let us know if you still have issues. Best, Dimitris On Wed, Apr 10, 2013 at 12:19 PM, Riko Adi Prasetya <" "Extraction problems while parsing wikipedias.csv" "uHi all, I am a GSoC student working on the List Extractor and I am trying to use the extraction framework, but unfortunately I am facing many problems. I have already read the documentation and many discussions about it but I couldn't find a solution. I followed the instructions in and proceeded with downloading the dumps, I was successful for as regards the download config files 'download.minimal.properties' and 'download.wikidata.properties', but when I try to use 'download.10000.properties' I get this error: [INFO] launcher 'download' selected => org.dbpedia.extraction.dump.download.Download downloading ' to 'C:\Users\Federica\Desktop\extraction\wikipedias.csv' read 48.77539 KB of ? 
B in 0.015 seconds (3.1754808 MB/s) parsing C:\Users\Federica\Desktop\extraction\wikipedias.csv java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at scala_maven_executions.MainHelper.runMain(MainHelper.java:164) at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26) Caused by: java.lang.IllegalArgumentException: unknown language code min at org.dbpedia.extraction.util.Language$$anonfun$apply$1.apply(Language.scala:185) at org.dbpedia.extraction.util.Language$$anonfun$apply$1.apply(Language.scala:185) at scala.collection.MapLike$class.getOrElse(MapLike.scala:128) at scala.collection.AbstractMap.getOrElse(Map.scala:59) at org.dbpedia.extraction.util.Language$.apply(Language.scala:185) at org.dbpedia.extraction.util.WikiInfo$.fromLine(WikiInfo.scala:69) at org.dbpedia.extraction.util.WikiInfo$$anonfun$fromLines$1.apply(WikiInfo.scala:50) at org.dbpedia.extraction.util.WikiInfo$$anonfun$fromLines$1.apply(WikiInfo.scala:50) at scala.collection.Iterator$class.foreach(Iterator.scala:743) at scala.collection.AbstractIterator.foreach(Iterator.scala:1195) at org.dbpedia.extraction.util.WikiInfo$.fromLines(WikiInfo.scala:50) at org.dbpedia.extraction.util.WikiInfo$.fromSource(WikiInfo.scala:37) at org.dbpedia.extraction.util.WikiInfo$.fromFile(WikiInfo.scala:28) at org.dbpedia.extraction.dump.download.Download$.main(Download.scala:52) at org.dbpedia.extraction.dump.download.Download.main(Download.scala) 6 more [INFO] uHi there, I regenerated the settings files. Please pull from the master and try again. I will have a closer look at this root problem in the coming days so this won't happen again. Best luck with the GSoC. On Tue, May 17, 2016 at 2:58 PM, Federica Baiocchi < > wrote:" "MusicBrainz DBpedia Links" "uApologies that I've been slow to follow up on my offer (I had a change of jobs in between). The MusicBrainz RDF dumps via R2RML are now official: via These include DBpedia links produced via the following mappings: Producing: Now, I'm considering how to add these to the project: I notice there that different languages' DBpedias are separated, whereas I'd dealt with them all in one mapping file. Should I refactor these, or can they be accepted in one piece? Barry Apologies that I've been slow to follow up on my offer (I had a change of jobs in between). The MusicBrainz RDF dumps via R2RML are now official: Barry" "how to get most recent data from dbpedia" "uHello I am working on dbpedia based quiz game where I will have questions like who is the captain of German Footbal team? Now the captains keeps changes every time I will have to make changes in my code? Similarly there can be other questions too so what is the possible solution? Regards Hello I am working on dbpedia based quiz game where I will have questions like who is the captain of German Footbal team? Now the captains keeps changes every time I will have to make changes in my code? Similarly there can be other questions too so what is the possible solution? Regards" "DBpedia categories; URL shortening" "uHi Kingsley! The problem is: 1. How to KNOW which categories represent types. There are 5-10 approaches for this, based on NLP and ML. E.g. Yago2 assumes that if the head-word of the cat name is a plural noun, that's a class. 
MENTA improves this by looking for a head-word that's a countable noun. A lot of them tie up into Wordnet, some into OpenCyc/UMBEL. It's a hard research problem. 2. How to KNOW which category-entity instance is an exception. E.g. a category \"Books of Author Xyz\" is typically applied to the page \"Author Xyz\" and a naïve interpretation will conclude that the author is a book. In wikipedia, Categories are a navigational aid, i.e. \"mere\" links. The trick is how to find those links that are type links. uOn 1/10/15 1:54 AM, Vladimir Alexiev wrote:" "DBpedia as Tables 2016-04 release" "uDear all, We are happy to announce the new release of the DBpedia as Tables tool [1]. As some of the potential users of DBpedia might not be familiar with the RDF data model and the SPARQL query language, with this tool we provide some of the core DBpedia 2016-04 data in tabular form as CSV and JSON files, which can easily be processed using standard tools, such as spreadsheet applications, relational databases or data mining tools. For each class in the DBpedia ontology (such as Person, Radio Station, Ice Hockey Player, or Band) we provide a single CSV/JSON file which contains all instances of this class. Each instance is described by its URI, an English label and a short abstract, the mapping-based infobox data describing the instance (extracted from the English edition of Wikipedia), and geo-coordinates (if applicable). Altogether we provide 463 CSV/JSON files in the form of a single ZIP file that describe data we extract from English Wikipedia [2] and Wikimedia Commons [3] More information about the file format as well as the download link can be found on the DBpedia as Tables Wiki page [1]. Any feedback is welcome! Best regards, Petar, Dimitris and Chris [1] [2] ediaClasses.htm [3] 4/" "Advancing the DBpedia ontology" "uDear all, We are in the process of reorganizing the DBpedia ontology. Anyone who is interested to contribute to the future directions of the project is welcome to join. The plan is as follows: 1) We have a dedicated session in the next DBpedia meeting in Dublin (Feb 9th) [1] where we will discuss the editing workflow, future directions and the formation of the DBpedia ontology committee [2]. 2) The committee will be responsible to set the future plans & rules that will be announced shortly after the meeting. Best regards, Dimtiris [1] wiki.dbpedia.org/meetings/Dublin2015 [2] DBpedia_Ontology_Committee uHi Dimitris et al., A) What is the specific use you have in mind? B) Are you thinking about a centralized ontology managed by editors, a user-contributed ontology, or an automatically generated taxonomy? C) How will it relate to other ontologies, taxonomies and schemas? Also, will it relate to Wikidata, Wikipedia, schema.org, Facebook OG, etc. D) How will you categorize Wiki pages (and possibly other documents) against this ontology? Cheers.-N. u uHi Nicolas, On Fri, Jan 23, 2015 at 10:23 PM, Nicolas Torzec < > wrote: The primary target is to clean up the ontology from duplicated properties/classes and set correct type hierarchies. Besides that we have may ideas in mind to enrich and align the ontology DBpedia was a completely user-driven ontology and we plan to keep it that way. Now we will only set some editing workflow rules that will ensure a basic level of quality. We also plan to change the editing infrastructure and move to WebProtege that will enable easier importing of automatically generated axioms. Further alignment to other ontologies is also under strong consideration. 
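To make the \"duplicated properties/classes\" target concrete: a rough first pass is simply to look for distinct terms in the dbo namespace that share an English label. A sketch (assuming the ontology triples are loaded at the public endpoint, which is normally the case):

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?p1 ?p2 ?label WHERE {
  ?p1 a owl:ObjectProperty ; rdfs:label ?label .
  ?p2 a owl:ObjectProperty ; rdfs:label ?label .
  FILTER (STR(?p1) < STR(?p2))
  FILTER (lang(?label) = 'en')
  FILTER (STRSTARTS(STR(?p1), 'http://dbpedia.org/ontology/'))
  FILTER (STRSTARTS(STR(?p2), 'http://dbpedia.org/ontology/'))
}
ORDER BY ?label

Pairs that come back (and the analogous lists for owl:DatatypeProperty and owl:Class) are candidates for merging, not proof of duplication, so they still need a human look.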
This change will not affect the core extraction framework but will facilitate the integration of recent work on A-BOX assertions with NLP techniques. uHi Peter, ATM I can only answer for disjointness axioms. We plan to use them for cleaning up extracted data so we definitely want them. For the rest, we are open to suggestions and this is one of the reasons we invite ontology experts to participate. Best, Dimitris On Sat, Jan 24, 2015 at 5:47 AM, Peter F. Patel-Schneider < > wrote: uHi Dimitris and Antonino, A couple of questions: When you mentioned interlinking methods, are you talking about instances matching or schema matching? I am very interested in methods for instances deduplication in a triplestore. Any reference? How do you keep a version control of the ontology changes with WebProtege? Does it work fine with many contributors editing the ontology simultaneously? How does DBPedia ontology evolved over time? These are all relevant questions for other community-driven ontologies like VIVO ( Best, Alexandre Rademaker uOn Sat, Jan 24, 2015 at 12:58 PM, Antonino Lo Bue < > wrote: Hi Antonino, Yes, the DBpedia extraction framework is always aligned with the DBpedia ontology. Dimitris uHi Alexandre, On Sat, Jan 24, 2015 at 2:08 PM, Alexandre Rademaker < > wrote: In this context we refer only on schema matching. At the moment, the DBpedia ontology is collaboratively edited in a custom mediawiki (mappings.dbpedia.org). If/When we move to WebProtege, the wiki will probably be in sync but only in read mode and as far as I can tell, WebProtege keeps versioning. uHi Michael, On Sat, Jan 24, 2015 at 2:14 PM, Michael Brunnbauer < > wrote: Really? The DBpedia ontology has been evolving since it started so we see this like operations-as-usual, but in a more structured way. If there is any change in the core top-level classes we will announce it in advance. Best, Dimitris uI would like to help. I cannot travel much, but I do have access to a shared memory supercomputer and a large Hadoop appliance if anyone is interested in applying some new techniques for modeling the ontology. I work for Cray. Aaron u uOn Sat, Jan 24, 2015 at 6:49 PM, Peter F. Patel-Schneider < > wrote: uI've enlarged the goals as follows: - set the future directions of the DBpedia ontology - set best practices for mapping - engage the community in meaningful discussions, eg see - formulate and execute focused investigations that lead to best practices, eg What's in a Name (this page lists 68!! \"name\" properties we currntly got), how to map Parent Places, etc - improve the ontology and mapping editing workflow Of course, each of these goals is up for discussion. But I strongly feel that working on the ontology in isolation from the mappings will not be productive. IMHO the major problems are not with the ontology itself, but more in the mappings. I've shared many weird and scary things, most are on E.g. how many defects can you find here? No peeking in the Discussion tab :-) Another quick quiz: - what is vicePresident in DBO? What should it be a subproperty of? - why the VicePresident class should be deleted? - what's wrong with this mapping: MAB> shared memory supercomputer and a large Hadoop appliance if anyone is interested Oh wow! I think what we need is a bunch of editors who know a bit about RDF, think clearly, and can spend time on editorial discussions and gardening. Supercomputer powers won't help here (but superhuman powers might :-) DK> DBpedia was a completely user-driven ontology and we plan to keep it that way. 
And hopefully educate the editors through discussions and gardening. Thus far there's been very little discussion on the mapping wiki, which is the major problem In fact this thread proves it: why aren't we discussing the goals of that committee on the wiki ? index.php?title=Talk:DBpedia_Ontology_Committee&action;=edit&redlink;=1 uVladimir has a good point and is one of the very active contributors lately. Of course we do not plan to keep the mappings in isolation from the ontology, there should always be a feedback loop. The problem we had so far was that the mappings drove the ontology design and if someone couldn't easily find a dbo property/class to map an infobox she created a new one. What we want now is the exact opposite, the ontology design should come first. What we did not yet announce is that we are already building some tools that will ease the feedback from the ontology to the mappings wiki and hopefully overall improve the data quality of DBpedia. On Sun, Jan 25, 2015 at 8:41 PM, Vladimir Alexiev < > wrote: u+1 on which expressive power for the ontology. It's important for inferences/validations.   -N. From: Peter F. Patel-Schneider < > To: Dimitris Kontokostas < > Cc: Nicolas Torzec < >; Linked Data community < >; \" \" < >; \" \" < > Sent: Saturday, January 24, 2015 8:49 AM Subject: Re: [Dbpedia-discussion] Advancing the DBpedia ontology uNot very often, everyone praises Wikipedia URLs as persistent - Yes, there's prop dbo:wikiPageRedirects. - dbpedia.org doesn't let you land on such page, but redirects you - nevertheless, there's a useful label there that can be copied to the target nodel - not perfect (e.g. \"god doesn't play with dice\" redirects to Einstein but is not a name of his), but many people use it uDear Dimitris, I'm happy to hear that more work will be invested into a reorganization of the DBpedia ontology. As you might know, together with Magnus we have already invested some thoughts (and publications) into the topic with the focus on data cleansing based on an improved DBpedia ontology.[1,2] Unfortunately, Magnus and I will not be able to participate live at the Dublin Meeting in Feb 9. Nevertheless, we would like to contribute. From our perspective we would like to apply the DBpedia ontology to detect inconsistencies and flaws in DBpedia facts. This should not only be possible in a retroactive way, but should take place much earlier. Besides the detection of inconsistencies during the mapping process or afterwards in the extracted data, this could already be possible right from the start when the user is changing the wikipedia infobox content (in the sense of type checking for domain/range, checking of class disjointness and further constraints, plausibility check for dates in connection with basic axioms to be defined, etc.). Another possibility would be a tool that makes inconsistencies/flaws in wikipedia data visible directly in the wikipedia interface, where users could either correct them or confirm facts that are originally in doubt. To achieve this, not only a formally sound and semantically enrichedDBpedia ontology including a set of basic axioms would be necessary, but also the applications and infrastructure that make use of the ontology. Also the relation of the DBpedia ontology to other ontologies would be a rather interesting topic. This includes the already proposed schemata (schema.org, facebook OG, etc) as well as established ontologies (yago, umbel, etc) where mapping to DBpedia entities already exist. 
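As a small, concrete instance of the checks sketched above: once a disjointness axiom such as dbo:Person owl:disjointWith dbo:Place is agreed on (that axiom is an assumption here, not something the current ontology is guaranteed to declare), the offending instances can be listed directly and fed back to the mappings or to Wikipedia editors:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?s WHERE {
  ?s a dbo:Person .
  ?s a dbo:Place .
}
LIMIT 100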
Can we make use of these ontologies (and existing mappings) to complement DBpedia ontology in som (semi-)automated way? Thanks and best regards, Harald [1] G. Töpper, M. Knuth, and H. Sack: DBpedia ontology enrichment for inconsistency detection. i-SEMANTICS 2012 [2] J. Waitelonis, N. Ludwig, M. Knuth, H. Sack: Whoknows? - Evaluating Linked Data Heuristics with a Quiz that cleans up DBpedia. ITSE, vol.8, 2011 (3) uHi all, I am currently working with Aldo Gangemi on exploiting the mappings to DOLCE (and the high level disjointness axioms in DOLCE) for finding modeling issues both in the instances and the ontology. I will not be able to travel to the meeting either, but of course we will share our findings once we're finished, and probably also ask for some input and feedback. Cheers, Heiko Am 03.02.2015 um 09:39 schrieb Harald Sack: uThanks for the hint Hugh, Actually this was quite easy to implement, we just needed a new mapping sample extraction: It would be great if someone could pick up the rest of the related templates found here Any other ideas for templates we do not currently extract is of course more than welcome These will be included in the next DBpedia release Cheers, Dimitris On Tue, Feb 3, 2015 at 12:18 AM, Hugh Glaser < > wrote: uHi all @Heiko, Any feedback is of course more than welcome, btw we had some offline discussion with Aldo who already joined. @Harals & Magnus, you existing work & ideas can greatly contribute to this effort and since you already joined as well, welcome to the DBpedia ontology committee @all We will try to stream this session but I cannot guarantee the result. We will send details later this week for details We will also accept some remote skype presentations but will try to keep them short. Please email your ideas to Best, Dimitris On Tue, Feb 3, 2015 at 10:44 AM, Heiko Paulheim < > wrote: uIndeed Dimitris. I basically told the same as Heiko was mentioning. @Heiko I started giving feedback to the discussion taking place at Aldo uDear all At first thank you all for the great feedback and interest We will try to stream the DBpedia ontology session from the following link For those interested in continuing the discussion by mail we can use the dedicated mailing list: Or you can continue contributing on our document Will also accept a few remote presentations that you can suggest in out mailing list Best regards, Dimitris On Wed, Feb 4, 2015 at 1:54 PM, Aldo Gangemi < > wrote: uHi everyone! My presentations from the Dublin meeting are at - An example of adding a mapping, while making a couple props along the way and reporting a couple of problems. - Provides a wider perspective that data problems are not only due to the ontology, but many other areas. 3. Mapping Language Issues 4. Mapping Server Deficiencies 5. Mapping Wiki Deficiencies 6. Mapping Issues 7. Extraction Framework Issues 8. External Mapping Problems 9. Ontology Problems Almost all of these are also reported in the two trackers described in sec.2 Sounds very interesting! I've been quite active in the last couple of months, but I've been pecking at random here and there. More systematic approaches are definitely needed, as soon as they are not limited to a theoretical experiment, or a one-time effort that's quickly closed down. I've observed many error patterns, and if people smarter than me can devise ways to leverage and amplify these observations using algorithmic or ML approaches, that could create fast progress. 
I give some examples of the \"Need for Research\" of specific problems: and next section. Sounds very promising! If I can help somehow with \"manual ontological & wiki labor\", let me know. Data vs ontology validation can provide - mapping defect lists - useful hints that the Extraction can use. The most important feature would be Use Domain & Range to Guide Extraction I'm doubtful of the utility of \"error lists\" to Wikipedia (or it needs to be done with skill and tact): 1. The mapping wiki adopts an Object vs DataProp Dichotomy (uses owl:ObjectProperty and owl:DatatypeProperty and never rdf:Property). But MANY Wikipedia fields include both links and text, and in many cases BOTH are useful 2. At the end of the day, Wikipedia is highly-crafted text, so telling Wikipedia editors that they can't write something, will not sit well with them. For example, who should resolve this contradiction: DBO: dbo:parent rdfs:range dbo:Person Wikipedia: | mother = [[Queen Victoria]] of [[England]] I think the Extraction Framework (by filtering out the link that is not Person), not Wikipedians But Wikipedia is moving towards using Wikidata props in template fields: through {{#property}}. Cheers! uVladimir, I am more than happy to work the ML problem with you. For your example of the dichotomy with the domain and range of \"mother\" and queen Victoria being the \"mother\", this begs for contextual approach to that concept Aaron u uI could ask you a converse question: how can you make an accurate ontology without looking at the data? And to look at the data, you need mappings (if not to execute then to document what you've examined). But more constructively: There is a large number of mapping problems independent of the ontology. E.g. when a Singer (Person) is mapped to Band (Organisation) due to wrong check of a field \"background\", I don’t care how the classes are organized, I already hurt that the direct type is wrong. Of course, having a good ontology would help! E.g. some guy named Admin made in 2010 two props \"occupation\" and \"personFunction\" with nearly identical role & history. - No documentation of course. - occupation has 100-250 uses, personFunction has 20-50 uses. - Which of the two to use? - More importantly, which have already been used right, and which are wrong? I suspect that most uses of occupation are as a DataProp, even though it's declared as an ObjectProp. DBpedia adopts an Object/DataProp Dichotomy that IMHO does not work well. See dbpedia-problems-long.html#sec-3-2 uHi Aaron! Would be great to work with someone from Cray but I don't have a good idea how to use ML here, nor indeed a lot of trust in using ML to produce or fix mappings. E.g. see this exchange: Generating 30% wrong prop maps for the Ukrainian dbpedia is IMHO doing them a disservice! Who's gonna clean up all this? I guess I'm more of a MLab (Manual Labor) guy, I just learned they coined such alias for crowdsourcing: She IS the mother, not sure what you mean. Here a simple post-extraction cleanup can take care of it: remove all statements that violate range (so dbo:parent [[England]] will be removed). But we dare not do it, because many of the ranges are imprecise, or set wishfully without regard to existing data / mappings. (As usual, the real data is more complex than any model of it.) So we need to check our Ontological Assumptions and precise domains/ranges before such cleanup. 
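Spelled out, the cleanup rule under discussion is \"drop object values whose type does not match the declared range\". For dbo:parent (declared range dbo:Person) the questionable triples can at least be listed before anyone decides to delete them; note that untyped objects also show up here, which is exactly the imprecision being warned about:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?child ?object WHERE {
  ?child dbo:parent ?object .
  FILTER NOT EXISTS { ?object a dbo:Person }
}
LIMIT 100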
See example in dbpedia-problems-long.html#sec-6-7 uVladimir, I'm thinking of trying to do some stats on the existing ontology and the mappings to see where there is room for improvement. I'm tied up this week with a couple of deadlines that I seem to be moving towards at greater than light speed, though my progress is not. As soon as I get the rough cut done, I'll share the results with you and maybe we can discuss paths forward? I'm with you on the 30% error rate: that doesn't help anyone. Aaron On Feb 25, 2015, at 08:02, Vladimir Alexiev < > wrote: uJohn, You make a good point, but are we talking about a complete tear-down of the existing ontology? I'm not necessarily opposed to that notion, but I want to make sure that we are all in agreement as to the scope of work, as it were. What would be the implications of a complete redo? Would the benefit outweigh the impact to the community? I would assume that there would be a ripple effect across all other LOD datasets that map to dbpedia, correct? Or am I grossly overstating/misunderstanding how interconnected the ontology is? Vladimir, your thoughts? Aaron uHi John, My thoughts are for DBpedia to stay close to the mission of extracting quality data from Wikipedia, and no more. That quality extraction is an essential grease to the linked data ecosystem, and of major benefit to anyone needful of broadly useful structured data. I think both Wikipedia and DBpedia have shown that crowdsourced entity information and data work beautifully, but the ontologies or knowledge graphs (category structures) that emerge from these efforts are mush. DBpedia, or schema.org from that standpoint, should not be concerned so much about coherent schema, computable knowledge graphs, ontological defensibility, or any such T-Box considerations. They have demonstrably shown themselves not to be strong in these suits. No one hears the term \"folksonomy\" any more because all initial admirers have seen no crowd-sourced schema really work (from dmoz to Freebase). A schema is not something to be universally consented to, but a framework by which to understand a given domain. Yet the conundrum is: to organize anything globally, some form of conceptual agreement about a top-level schema is required. Look to what DBpedia now does strongly: extract vetted structured data from Wikipedia for broader consumption on the Web of data. My counsel is to not let DBpedia's mission stray into questions of conceptual \"truth\". Keep the ontology flat and simple, with no aspirations other than \"just the facts, ma'am\". Thanks, Mike On 2/25/2015 10:33 PM, M. Aaron Bossert wrote: uThe one thing I would say is that while I agree in general, the one thing that keeps eating away at me is that there is tremendous potential in dbpedia for bigger questions to be answered, but the more advanced analytics require that some level of sanity exists within the ontology, much more so than now. As an example, I have created several different applications for customers that are based on dbpedia, one of which is a recommender system. The level of effort required to simply say (in SPARQL, of course) \"show me every living person that is highly similar to person X, excluding politicians, athletes and actors\" is quite a tedious thing to do until after I have \"fixed\" all the erroneous and missing properties associated with \"things\" in general: which person class do I focus on? Which living people? Which politicians? Perhaps legislators? It gets pretty ugly, pretty quickly.
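For the record, here is one way such a query can be phrased today, using shared Wikipedia categories as a (crude) similarity signal and \"no dbo:deathDate\" as a stand-in for \"living\"; dbr:Tim_Berners-Lee is only a placeholder seed, and the missing-property problem described above is exactly why both shortcuts are unreliable:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?person (COUNT(?cat) AS ?sharedCats) WHERE {
  <http://dbpedia.org/resource/Tim_Berners-Lee> dct:subject ?cat .
  ?person dct:subject ?cat ;
          a dbo:Person .
  FILTER (?person != <http://dbpedia.org/resource/Tim_Berners-Lee>)
  FILTER NOT EXISTS { ?person dbo:deathDate ?d }
  FILTER NOT EXISTS { ?person a dbo:Politician }
  FILTER NOT EXISTS { ?person a dbo:Athlete }
  FILTER NOT EXISTS { ?person a dbo:Actor }
}
GROUP BY ?person
ORDER BY DESC(?sharedCats)
LIMIT 20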
I'm not sure that the ontology needs to be completely rewritten, but surely it can't be that difficult to clean up a bit with a little common-sense logic applied, such as: if a \"thing\" has a death date (never mind which one), then surely they are not a living person; or if they hold a political office, surely they must be a politician. Aaron uI agree with Aaron and this is the reason we started this effort. Even a small improvement in quality through the ontology could have a big impact. Improving the ontology should be an iterative process that will take into account both the data and the mappings. However, some decisions/actions can be made independent of the data or the mappings. @John Flynn, all the issues you mention are valid and we are working on a more formal description of the requirements that will cover at least some of them. @Mike Bergman, I think everyone agrees that strict schemas do not work well in crowdsourced data and we need to define some trade-offs. @all, looks like this thread got too big and too focused on DBpedia. I suggest we continue the discussion on the dbpedia-discussion / dbpedia-ontology mailing lists. Best, Dimitris On Thu, Feb 26, 2015 at 8:07 AM, M. Aaron Bossert < > wrote: u" "Mappings server down?" "uHi, Is the DBpedia Mappings server at down? I've noticed in the past it will go down for days at a time. I'm curious what is the expected uptime of that server, or if it is even meant for public use. I use the Mappings server sometimes to download the latest version of the DBpedia ontology. Is there a more reliable way to accomplish this if the remote server is down, for example by running my own Mappings server locally? Thanks, David uOn Wed, Oct 10, 2012 at 5:17 AM, David Butler < > wrote: It is definitely meant for public use. We don't have a target uptime though. We've been moving a few servers lately and didn't get to the mappings server yet. We'll fix it tomorrow or the day after. If you just need the ontology, there is a script, download-ontology: check out DBpedia from Mercurial, dump branch, cd to core/, and execute /run download-ontology if you're running bash, or mvn scala:run -Dlauncher=download-ontology otherwise. This will download the ontology in OWL and MediaWiki format. Hope this helps a bit. JC" "Dump-based Extraction error" "uI'm trying to follow the instructions on 'Could not transfer metadata org.sweble.wikitext:swc-engine:1.1.1-SNAPSHOT/maven-metadata.xml from/to osr-public-releases ( I certainly can't see 1.1.1-SNAPSHOT/maven-metadata.xml. Is that likely to be what's causing the failure? (I'm trying to do this using the Command Line in Windows Vista.) I'm a newbie at this so would appreciate your patience and advice. Hywel Jones uHi Hywel, We have a dependency on the sweble library and the hosting server was down for a couple of days. This should be fixed by now, can you confirm? Best, Dimitris uDimitris, Thanks. I've had another go and seem to get past the previous difficulty. I now get as far as: '[ERROR] scalac error: C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\core\target\classes does not exist or is not a directory.'
(Before that point I also got the following info message which would seem to suggest the extraction-framework is also missing another directory: '[INFO] skip non existing resourceDirectory C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\core\src\main\resources') The complete log is here:  Thanks for your help. Hywel From: Dimitris Kontokostas < > To: Hywel Jones < > Cc: \" \" < > Sent: Friday, 28 February 2014, 21:17 Subject: Re: [Dbpedia-discussion] Dump-based Extraction error Hi Hywel, He have a dependency on the sweble library and the hosting server was down for a couple of days.  This should be fixed by now, can you confirm? Best, Dimitris Dimitris, Thanks. I've had another go and seem to get past the previous difficulty. I now get as far as: '[ERROR] scalac error: C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\core\target\classes does not exist or is not a directory.' (Before that point I also got the following info message which would seem to suggest the extraction-framework is also missing another directory: '[INFO] skip non existing resourceDirectory C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\core\src\main\resources') The complete log is here: Dimitris uHi Hywel, This looks like a file / folder permission issue. Can you check the permissions in C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\* Cheers Dimitris On Mar 1, 2014 8:24 PM, \"Hywel Jones\" < > wrote: uDimitris, You were right. I just moved it to C:\extraction-framework-master and it then ran successfully. So, on to the next problem! I edited the download.minimal.properties file changing the target folder with: base-dir = C://Users//Public//wiki_dump_articles,  and changing language xx to cy with::  download=cy:pages-articles.xml.bz2 Dumbly following the instructions in section 4 of  cd dump and then using download.minimal.properties : /run download config=download.minimal.properties but that just gave me the following message: '' is not recognized as an internal or external command, operable program or batch file. I thought the / might be the problem so tried: mvn c:\extraction-framework-master\run download config=download.minimal.properties. This ran but failed with: [ERROR] No plugin found for prefix 'c' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (C:\Users\gweinyddwr\.m2\repository), central ( 'Gweinyddwr' is administrator in Welsh so C:\Users\gweinyddwr\.m2\repository is the default (if I recall correctly) location chosen when maven was installed. Another permissions problem or something else? Hywel From: Dimitris Kontokostas < > To: Hywel Jones < > Cc: Sent: Sunday, 2 March 2014, 14:16 Subject: Re: [Dbpedia-discussion] Dump-based Extraction error Hi  Hywel, This looks like a file / folder permission issue. Can you check the permissions in C:\Users\Hywel\Documents\dbpedia_extraction_frmwk\extraction-framework\* Cheers Dimitris On Mar 1, 2014 8:24 PM, \"Hywel Jones\" < > wrote: Dimitris, uHi Hywel, Actually \"run\" is a linux script I updated the windows based instructions here: It would be great if someone could contribute an equivalent .bat file Cheers, Dimitris On Mon, Mar 3, 2014 at 12:35 AM, Hywel Jones < >wrote:" "Help with iOS and accessing DBPedia" "uHi all This is my first day to the mailing list, hope you are all well. 
So I have been scouring the web furiously but have just come upon disparate information on how to use sparql and dbpedia in iphone applications. I am new to semantic web and sparql, but need to find a way to be able to query the dbpedia live endpoint and parse its results in iOS. My problems 1. How do I send a SPARQL request to the DBPedia sparql Endpoint - a link to a tutorial or code samples would be perfect! 2. How does the result of the query get returned back and how do I consume it. 3. What can I use to parse the response of this request. Will one of the iOS XMLParsers be sufficient? Thanks all. Any links to good resources would be amazing. Google doesn't seem to be coming up with the goods at the moment Tony Hi all This is my first day to the mailing list, hope you are all well. So I have been scouring the web furiously but have just come upon disparate information on how to use sparql and dbpedia in iphone applications. I am new to semantic web and sparql, but need to find a way to be able to query the dbpedia live endpoint and parse its results in iOS. My problems 1. How do I send a SPARQL request to the DBPedia sparql Endpoint - a link to a tutorial or code samples would be perfect! 2. How does the result of the query get returned back and how do I consume it. 3. What can I use to parse the response of this request. Will one of the iOS XMLParsers be sufficient? Thanks all. Any links to good resources would be amazing. Google doesn't seem to be coming up with the goods at the moment Tony uHi Antonio, Basically any SPARQL interface is a restful interface, so you can create a sparql url by adding the query and format options as parameters. Consider the following url: Demo: Find me 100 example concepts in the DBPedia dataset. As a demonstration this uses the qtxt parameter to fill in the fields on the form so you can see what is going on. When you press the \"Run Query\" button, it will give you a table. If you look at the addressbar of your browser you will see that the &qtxt; has been replaced by &query; which triggers the /sparql endpoint to automatically execute the query. By setting the Format selector from HTML to Json or RDF/XML you can change the way the result set of your query is returned. Again after you have pressed the \"Run Query\" button, you can see how changes to the form change the &format; field in the addressbar. See above. I am seeing plenty of Google links that maybe of interest but you can start with: [1] [2] [3] Hope this helps Patrick uHi Patrick Thank you so much for your email! Super helpful to point out that SPARQL is restful. I'm not sure how I missed that in all the stuff I was reading. I guess the next step would be to form the web request in iOS and specify the format of the response to RDF XML. Think I'll be using GXML to parse the data. You rock, helped me completely! Tony On 9 January 2012 07:59, Patrick van Kleef < > wrote: Hi Patrick Thank you so much for your email! Super helpful to point out that SPARQL is restful. I'm not sure how I missed that in all the stuff I was reading. I guess the next step would be to form the web request in iOS and specify the format of the response to RDF XML. Think I'll be using GXML to parse the data. You rock, helped me completely! Tony On 9 January 2012 07:59, Patrick van Kleef < > wrote: Hi Antonio, This is my first day to the mailing list, hope you are all well. So I have been scouring the web furiously but have just come upon disparate information on how to use sparql and dbpedia in iphone applications. 
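A query in the same spirit as the demo Patrick mentions (the exact demo link was not preserved here) is the classic \"list some classes in use\" query below; sent as an HTTP GET it goes URL-encoded into the query= parameter of http://dbpedia.org/sparql, and the format parameter (e.g. application/sparql-results+json) picks the serialization, so any plain HTTP client on iOS can consume it:

# List 100 distinct classes that have at least one instance.
SELECT DISTINCT ?concept WHERE {
  [] a ?concept
}
LIMIT 100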
I am new to semantic web and sparql, but need to find a way to be able to query the dbpedia live endpoint and parse its results in iOS. My problems 1. How do I send a SPARQL request to the DBPedia sparql Endpoint - a link to a tutorial or code samples would be perfect! Basically any SPARQL interface is a restful interface, so you can create a sparql url by adding the query and format options as parameters. Consider the following url: Demo: Find me 100 example concepts in the DBPedia dataset. Patrick" "How dbpedia extractor decide the type or class of article?" "uHi, I'm sorry if I'm asking a basic question here. How the dbpedia extractor decide the class (< For example \" \"MeanOfTransportation\", \"FrontWheelDriveVehicles\", \"Sedans\", \"Automobile\" etc. Did the extractor read from \"Infobox\" properties?(e.g. body_style) only? What if the article did not consist of 'infobox'? Only a pure text of explanation of article. Will it read certain keyword inside the 'abstract' and 'guess' the class of article? Or, are there more complex mechanism happened at the engine before it decide the type/class of article? regards, sarif PS : Below is how I check the data from \" 1) First I check the list of class: SELECT ?class WHERE {?class a owl:Class} ORDER BY ?class limit 100 result: 2) Then, check the member of one of the class - e.g. Automobile SELECT ?member WHERE {?member a } limit 100 Result: http://dbpedia.org/resource/Honda_NSX http://dbpedia.org/resource/Mitsubishi_Starion http://dbpedia.org/resource/Dodge_Ramcharger http://dbpedia.org/resource/Plymouth_Fury 3) Next, to see all properties of one of the member SELECT ?p ?o WHERE { ?p ?o } Result: http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Thing http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/MeanOfTransportation http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/1990sAutomobiles http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/SportCompactCars http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/FrontWheelDriveVehicles http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/Sedans http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Automobile http://xmlns.com/foaf/0.1/page http://en.wikipedia.org/wiki/Honda_Integra Hi, I'm sorry if I'm asking a basic question here. How the dbpedia extractor decide the class (< http://www.w3.org/1999/02/22-rdf-syntax-ns#type >) of certain articles? For example ' http://en.wikipedia.org/wiki/Honda_Integra ' article is 'MeanOfTransportation', 'FrontWheelDriveVehicles', 'Sedans', 'Automobile' etc. Did the extractor read from 'Infobox' properties?(e.g. body_style) only? What if the article did not consist of 'infobox'? Only a pure text of explanation of article. Will it read certain keyword inside the 'abstract' and 'guess' the class of article? Or, are there more complex mechanism happened at the engine before it decide the type/class of article? regards, sarif PS : Below is how I check the data from ' http://dbpedia.org/sparql ' 1) First I check the list of class: SELECT ?class WHERE {?class a owl:Class} ORDER BY ?class limit 100 result: http://dbpedia.org/ontology/Automobile http://dbpedia.org/ontology/AutomobileEngine http://dbpedia.org/ontology/AutomobilePlatform http://dbpedia.org/ontology/Award 2) Then, check the member of one of the class - e.g. 
Automobile SELECT ?member WHERE {?member a < http://dbpedia.org/ontology/Automobile > } limit 100 Result: http://dbpedia.org/resource/Honda_Integra http://dbpedia.org/resource/Honda_NSX http://dbpedia.org/resource/Mitsubishi_Starion http://dbpedia.org/resource/Dodge_Ramcharger http://dbpedia.org/resource/Plymouth_Fury 3) Next, to see all properties of one of the member SELECT ?p ?o WHERE {< http://dbpedia.org/resource/Honda_Integra > ?p ?o } Result: http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Thing http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/MeanOfTransportation http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/1990sAutomobiles http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/SportCompactCars http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/FrontWheelDriveVehicles http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/class/yago/Sedans http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Automobile http://xmlns.com/foaf/0.1/page http://en.wikipedia.org/wiki/Honda_Integra" "problem with query" "uhi i have a problem with this query in snorqul SELECT ?x ?y WHERE { ?x ?y OFFSET 0 it throws me this error: 37000 Error SP030: SPARQL compiler, line 12: Undefined namespace prefix at 'http' before '/' SPARQL query: define input:default-graph-uri PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: SELECT ?x ?y WHERE { ?x ?y OFFSET 0 what is wrong with the query? greez uHi Faraz, URIs need to be enclosed in < > in SPARQL, so the query should be: SELECT ?x ?y WHERE { ?x ?y .} LIMIT 1000 OFFSET 0 Which returns the expected data: Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 19 Mar 2011, at 12:27, Faraz Fallahi wrote:" "How to extract profession of a person?" "uHi, In Wikipedia, for example, Steve Jobs is described as: *Steven Paul* \"*Steve*\" *Jobs* (/ > ˈ dʒ > ɒ b > z / ; There's \"dbpprop:occupation\" that returns title of occupations but misses the place name for occupations: Is there a way to extract the profession of a person from DBPedia at all? (e.g. 'entrepreneur', 'marketer', or 'inventor'.) — Best regards, Behrang Saeedzadeh Hi, In Wikipedia, for example, Steve Jobs is described as: Steven Paul  ' Steve ' Jobs  ( / ˈ dʒ ɒ b z / ; February 24, 1955 – October 5, 2011) [5] [6]  was an American entrepreneur , [7]  marketer, [8]  and inventor, [9]  who was the co-founder (along with Steve Wozniak  and Ronald Wayne ), chairman, and CEO of Apple Inc.  But in Saeedzadeh uHi Behrang, On Thu, Nov 21, 2013 at 12:05 PM, Behrang Saeedzadeh < >wrote: I don't think that any of these is an actual occupation and probably this is the reason they are not included in the infobox at all. This information is extracted from the infobox and this is how wikipedia has it | occupation = Co-founder, Chairman and CEO, [[Apple Inc.]] Co-founder and CEO, [[Pixar]] Founder and CEO, [[NeXT|NeXT Inc.]] Our framework interprets as a separate value and that is why each one has it's own triple You may also want to check the dbpedia-owl:occupation property, it has these values - dbpedia:NeXT - dbpedia:Pixar - dbpedia:Apple_Inc. and is generated from here: Best, Dimitris" "Post-extraction script help" "uHello everybody, I have a problem with post-extraction script, in the \"script\" directory. 
I don't understand why I can't run many of them, on the already extracted data. Do you have a wiki page for it, that I did't find ? With regards Raphael Boyer Hello everybody, I have a problem with post-extraction script, in the \"script\" directory. I don't understand why I can't run many of them, on the already extracted data. Do you have a wiki page for it, that I did't find ? With regards Raphael Boyer uWhat's the problem? Please send us error messages, stack traces, configuration parameters etc. The more details the better. On Mar 27, 2015 10:26 AM, \"Raphael Boyer\" < > wrote:" "DBpedia Lookup: Keyword Search vs Prefix Search" "uHi, I've tried to learn about Scala Code on Github: about the Keyword Search, I've seen that the label of the string to be searched was checked, and all the property \" About the Prefix Search, I don't understand what type of search the Lookup Service is doing, what type of properties are checked, what type of anchor text is used in Wikipedia to refer to the string to be searched. Is someone that could explain what are the main differences between the Keyword Search and the Prefix Search? In the Prefix Search, are present some NLP/NER tasks?I've found in the Github Source a Scala Class named \"pignlproc\", that refers to Apache Pig Libraries. Thanks in Advance. Kind Regards. Francesco Marchitelli Hi , DBpedia Lookup is a web service that can be used to look up DBpedia URIs by related keywords. Related means that either the label of a resource matches, or an anchor text that was frequently used in Wikipedia to refer to a specific resource matches (for example the resource Marchitelli uHello Lookup uses Lucene to generate indexes on various fields one one of which being \"Surface form Keyword\" on the labels. The Prefix Search of the query works in two steps: 1. Find every index term matching the regular expression . This is done through FuzzyQuery algorithm in Lucene which works on edit distance between the given query and the index term. 2. Run a MultiTermQuery looking for documents matching one of these terms. So prefix search is similar to the keyword seach with an additional query completion process. For Reference: Hope this helps. Best Regards Kunal On Mon, Jul 18, 2016 at 12:04 PM, Francesco Marchitelli <" "Non-well-formed RDF/XML" "uHi all, I've noticed that there are some resources in DBpedia that returned non-well-formed RDF/XML, e.g. Is that problem known? And is there a simple workaround for dealing with such non-well-formed RDF/XML documents, e.g., when processing them with Jena? Best, Heiko. uWorks for mewhat was the error message? Do you have other examples? On Jun 5, 2012 5:42 PM, \"Heiko Paulheim\" < > wrote: uI just noticed that Darmstadt.rdf contains lines like uThank you, you guys are just so quick. Cheers, Heiko. On Tue, 5 Jun 2012 19:36:06 +0300, Mitko Iliev < > wrote: >> with uTo be more accurate, this is a trick to make a triple valid in RDF/XML xml elements do not allow all characters, thus making the namespace ' > RDF/XML. The same applies for IRIs. Not all IRI characters are valid for an XML element thus serializing IRI predicates in RDF/XML may (or probably) result in invalid files. This is an example where the URI representation could result in valid RDF/XML using the 1st trick while not in IRI (because it ends with '?') My point is that since it is valid RDF we should not change only because of one spec Cheers Dimitris In short, all properties in DBpedia 3.8 can be represented in RDF/XML. 
uOn Wed, Jun 6, 2012 at 12:03 PM, Dimitris Kontokostas < > wrote: I understand your concerns. I'm also not a big fan of this underscore trick, but in the end I think the advantages outweigh the disadvantages: - only properties from the affected - the generic properties extracted from templates / infoboxes without much cleanup. - only about one in a thousand of these triples is affected - in the last few months, there have been quite a few bug reports and mailing list requests about broken RDF/XML. OpenLink updated Virtuoso so that these properties are commented out in RDF/XML, but this may also confuse some users. Cheers, JC" "Get Resource name and Category" "uDear all, I want to get list of all names of resources which is used for searching in DBPedia e.g Shahid_Afridi, Barrack_Obama, etc etc. How can i get that WHOLE list?? Also i want to get their categories e.g Kevin_Peterson belong to Athlete/Sports/cricketer categories. How and using which Tag i can retrieve all these categories for all resources? uHi Hamza, On 03/02/2013 07:32 AM, Hamza Asad wrote: Try the following query, if you want a list of all people in DBpedia: SELECT * WHERE {?person a dbpedia-owl:Person. ?person rdfs:label ?personName. ?person ?category } LIMIT 1000 You can complete the list by using the same query but with increasing the \"OFFSET\" in each subsequent call." "Double underscore pattern" "uHi all I am trying to parse the data of dbpedia 3.6 file mappingbased_properties_en.nt into mysql db. I have found some unexpected pattern like shown below. By checking the Wikipedia page, it's showing Alexander Graham Bell occupation Inventor So why do we need extra property called title like below ? Could anyone shed some lights ? < Thanks William Hi all I am trying to parse the data of dbpedia 3.6 file mappingbased_properties_en.nt into mysql db. I have found some unexpected pattern like shown below. By checking the Wikipedia page, it's showing Alexander Graham Bell occupation Inventor So why do we need extra property called title like below ? Could anyone shed some lights ? < William" "DateTime where Year expected (e.g. Stephen_Fry entry)" "uDuty calls - something's wrong on the Internet - I ran into problems consuming dbpedia data with Gremlin, Problem is '1981-01-01T00:00:00+02:00' ; I doubt it's just that page, but am not sure which templates need fixing exactly. Excerpt, \"As an xsd:dateTime value, it's just fine. The problem is that DBpedia is using xsd:gYear instead. Evidently something like this is intended: dbr:Stephen_Fry dbpediaowl:activeYearsStartYear \"1981\"^^xsd:gYear . dbr:Stephen_Fry dbpediaowl:birthYear \"1957\"^^xsd:gYear . But this is what is actually being published: dbr:Stephen_Fry dbpediaowl:activeYearsStartYear \"1981-01-01T00:00:00+02:00\"^^xsd:gYear . dbr:Stephen_Fry dbpediaowl:birthYear \"1957-01-01T00:00:00+02:00\"^^xsd:gYear .\" Thanks for any help fixing this at source. cheers, Dan uHi Dan, Tough task but someone has to do it :) At first thanks for the report, I checked the extraction framework and the DBpedia 3.9 data sets and everything seems fine so, I guess it has to do with the Virtuoso server. Best, Dimitris On Wed, Nov 7, 2012 at 3:19 PM, Dan Brickley < > wrote: u0€ *†H†÷  €0€1 0 +" "Introduction to GSoC 2017" "uHello devs, I am Richhiey Thomas and am studying CS in Mumbai University. I'd like to participate in GSoC 2017 with DBPedia. I went through the information for GSoC students and people new to DBpedia and had an easy time getting introduced to the project. 
After looking at the basic instructions for students, I've setup the DBpedia extraction framework with Intellij IDEA and am currently trying to get a hang of how the framework works by looking into the codebase and its documentation. Going through past GSoC pages and the starter pages helped me have a good start. I will try my best to start solving bugs or issues to get an idea of how things really work by the time the ideas for this year are up :) With respect to background, I had participated in GSoC 2016 with Xapian Search Engine Library where my project was based on 'Clustering of Search Results'. I also have a decent command over Python, C++ (and thus OOP) and had started learning Scala recently. So this gives me a chance to look forward to the language :D I would love to know what more I can do to get involved! Thanks." "More information about how persons are extracted into persondata_en.nt, more about topical_concepts_en.nt instance_types.nt" "uHi all, i like to know more about these following two things, i hope you can give me some hints and information: - how are persons extracted into the dump file \"persondata_en.nt\"? Where can i get more information on that? Any good papers or other resources?
uHi, On 03/08/2013 04:47 PM, wrote: I would recommend you to read that paper [1]. Did you have a look at that [1]? [1] [2] DBpedia_datasets" "Use stub templates' mapping to detect resource type" "uHi all, now that the extraction framework handles multiple templates from a wikipage, I am wondering if there is any counterindication in using stub templates to extract rdf:type properties. Examples: clearly states that an article is about a Christian Bishop. That's why I have created this: The extraction works as expected: uNo and this was one of the reasons we implemented that feature. We tried to do the same trick in DBpedia Greek in the early days but came across the new \"blank node\" resources Cheers, Dimitris On Thu, Nov 21, 2013 at 3:41 PM, Andrea Di Menna < > wrote:" "Querying DbPedia to get country datas" "uHello, I need to make the profile of all countries of the world (area, population, currency, etc), so I would like to have the information contained in the infoboxes of all Wikipedia country pages. (look at this example: I'm sure DBpedia contains this information but I can't understand how I can get it! Could you help me with an example or something I can use? Thank you very much. uHello, Petite Escalope schrieb: How familiar are you with Semantic Web technologies? You can get this information by performing a SPARQL query at would build the query by selecting instances of dbpedia-owl:Country ( select * where { ?country a dbpedia-owl:Country . ?country dbpedia-owl:areaMetro ?area } The difficult part is to make the query as complete as possible, in particular if the property you are looking for is not in the DBpedia ontology (i.e. its URI is starting with In that case you need to query for several properties and use OPTIONAL patterns in your query. Kind regards, Jens" "default limitation for fetching triples from dbpedia endpoint" "uHi, Is there any default limitation on SPARQL queries from the dbpedia endpoint? ( I think the maximum number of triples that can be fetched is 2000 triples for every query; how can I omit this limitation? I'll be so pleased for your help. Thanks a lot, Sareh uHi Sareh, The number of connections/sec you can make, as well as restrictions on result set size and query time, are governed by the following settings: [SPARQL] ResultSetMaxRows = 2000 MaxQueryExecutionTime = 120 MaxQueryCostEstimationTime = 1500 These are in place to make sure that everyone has an equal chance to de-reference data from dbpedia.org, as well as to guard against badly written queries/robots. The following options are at your disposal to get round these limitations: 1. 
Use the LIMIT and OFFSET keywords You can tell a SPARQL query to return a partial result set and how many records to skip e.g.: select ?s where { ?s a ?o } LIMIT 1000 OFFSET 2000 2. Setup a dbpedia database in your own network The dbpedia project provides full datasets, so you can setup your own installation on a sufficiently powerful box using Virtuoso Open Source Edition. 3. Setup a preconfigured installation of Virtuoso + database using Amazon EC2 (not free) See: Please let me know if you have any questions on this or any other dbpedia sparql endpoint issue. Best regards, Patrick uThanks for your reply From: Patrick van Kleef < > To: sareh aghaei < > Cc: Sent: Fri, June 24, 2011 2:20:58 PM Subject: Re: [Dbpedia-discussion] default limitation for fetching triples from dbpedia endpoint Hi Sareh, The connections/sec you can make, as well as restrictions on resultset and query time, as per the following settings: [SPARQL] ResultSetMaxRows = 2000 MaxQueryExecutionTime = 120 MaxQueryCostEstimationTime = 1500 These are in place to make sure that everyone has a equal chance to de-reference data from dbpedia.org, as well as to guard against badly written queries/robots. The following options are at your disposal to get round these limitations: 1. Use the LIMIT and OFFSET keywords You can tell a SPARQL query to return a partial result set and how many records to skip e.g.: select ?s where { ?s a ?o } LIMIT 1000 OFFSET 2000 2. Setup a dbpedia database in your own network The dbpedia project provides full datasets, so you can setup your own installation on a sufficiently powerful box using Virtuoso Open Source Edition. 3. Setup a preconfigured installation of Virtuoso + database using Amazon EC2 (not free) See: Please let me know if you have any questions on this or any other dbpedia sparql endpoint issue. Best regards, Patrick" "keyword search not working anyomore ?" "uHi, I am using the DBpedia Lookup Services Prefix Search and Keyword Search in my program. Last time I tested it should be about a week ago and both worked. Today tested it and Keyword Search was not returning any results. Btw. Prefix Search is still working. When I am using the example given at Keyword Search: &QueryString;=berlin Dbpedia is also not returning any results. Has the address of the Keyword Lookup Service been changed or is the system broken? Best regards Sven" "nested templates; multiple classes" "uPLEASE let’s continue this discussion at From: Dimitris Kontokostas [mailto: ] The mappings extractor can't handle nested templates: As I said in another thread, it is very trivial to change but needs testing any volunteers from the community? I can provide an adapted version of the code and also dumps but someone needs to look at the data Sure: Boyan can deploy it locally, and I’ll look at the data. Gimme test cases (AT THE URL above). So far I got: * Film_date ( * Elvis should be Person, MilitaryPerson and MusicalArtist ( Is the logic \"pick one out of several disjoint classes\" documented precisely somewhere? And use cases/test cases? @jcsahnwaldt? I don't know but I have an uneasy feeling about such logic. If templateA says classA and templateB says classB, seems to me the extractor itself can't make an intelligent decision to drop one of them. 
* Either the maps are correct and both classes should be emitted (and what an ontologist thought are disjoint classes, the data proves are not) * Or a map needs to be fixed (eg Listen is not a class but an IntermediateNodeMapping with class Sound and relation soundRecording) * Or a template is wrongly applied (eg in bgwiki, \"Musical Artist\" was mis-applied to \"BG at the World Cup 1994\") Don't see room for Artificial Intelligence here ;-)" "Why Infobox_Geopolitical_organization (eg United_Nations) is mapped to Country?" "uThe page for United Nations: Uses \"Infobox geopolitical organization\" The mapping of this infobox has mapToClass = Organisation But the dbpedia data has type dbo:Country, and NOT dbo:Organisation Checking with the latest dump & mapping & extractor shows the same error. How is this happening? uThat's because \"Infobox geopolitical organization\" in Wikipedia itself is redirected to the country infobox: And the extraction framework uses the redirect dump of Wikipedia - otherwise lots of mappings created years ago for infoboxes that have been renamed in the meanwhile won't be found. Volha On 12/13/2014 4:29 PM, Vladimir Alexiev wrote: uBut - Country or territory - Geopolitical organization The fields in these are different (see attached). I think \"Geopolitical organization\" can be recognized by one of these 3 fields: |org_type =" "Videogame release dates" "uHere is another report of quality problems. How to solve them is worth discussion. I'm making something simple that, given a person, looks up creative works they are responsible for, looks up the dates of those creative works, then subtracts the birth date of those to get the age at which they created something and then makes a report. When I GET the prolific character designer I can find the games she did art for by following the backward dbo:gameArtist links to the games she designed, which are returned by the GET request. Then I can GET the games, but when I do so, the release dates are often incorrect, for instance has a release date back in the 1930's, which of course predates Tsunako and is invalid. The root cause is that there is an awful blob in the infobox that contains multiple release dates in various geographic regions. In my case, the standard of quality is that I want the earliest release date but I'm not too excited if I am off by ±1 year. (She might have done the illustrations in the prior year, etc.) The above one is obviously absurd and easy to catch, but exhibits a much more insidious error where it gets the 2015 release date of the Windows edition of a remade and heavily modified version (different combat system, different world travel, new voice acting, new music, ) which might pass by you if you're not the kind of person who drinks Nep Bull. Interstingly, Wikidata gets the release date right (by my definition) and claims it got it from Wikipedia. It's a run-of-the-mill kind of quality problem that affects users, but it gets into all the questions of \"what exactly do you want to model?\" as clearly the Wikipedia editors are trying to model it at a very fine grain but users might want a spectrum of different granularities. uThanks for the report Paul, These kind of cases are haunting us since the beginning of DBpedia. 
There are two cases here 1) flexibility of the mappings wiki to define fine grained extraction rules 2) representation uniformity of the data in Wikipedia for parsing them correctly For (1) we are working with Gent University on moving the mappings to RML which will give us great flexibility Fixing (1) will not help us much with (2) though since Wikipedia users might not put the right data in the right place and proper format for (2) we are working on fixing this problem by integrating data from multiple sources (inc Wikidata) and trying to resolve conflicts etc wrt Wikidata, the fact that Wikidata has this right and the value is taken from WIkipedia can have different interpretations. It could have been entered manually or there is a video game bot that is tweaked to parse these \"awful blobs\". Either way, it is good that it is there :) Cheers, Dimitris On Tue, Jan 24, 2017 at 6:14 PM, Paul Houle < > wrote:" "DBpedia updates/refreshes from wiki data" "uI had a quick question. How often does dbpedia update based on wiki changes, and is there a way to manually ask it to update? I've noticed a few items that seem to be a good bit out of date compared to the wikidata versions, and was hoping to correct some of the information in it. Thanks for the help. -Fshy I had a quick question. How often does dbpedia update based on wiki changes, and is there a way to manually ask it to update? I've noticed a few items that seem to be a good bit out of date compared to the wikidata versions, and was hoping to correct some of the information in it. Thanks for the help. -Fshy uThe latest DBpedia is based on dumps from March/April 2016, and has been released in September 2016. Unfortunately, the DataID [ Right now, DBpedia’s update cycle is (intended to be) half-yearly, so 2016-10 will become the next version. DBpedia Live does not consider wikidata at the moment. uThank you for making me aware of the missing information about the dump file. I found the cause and updated the dataids. You can find it here: 2016-04_dataid_wikidata.json under dc:issued: { @type: \"xsd:date\", @value: \"2016-03-05\" }, Best, Markus Freudenberg Release Manager, DBpedia On Tue, Nov 15, 2016 at 10:00 PM, Magnus Knuth < > wrote:" "DBpedia content negotiation" "uHi again, We did some more tests against DBpedia resources on our framework, it looks like DBpedia does not do content negotiation properly in case it does not support the client requested MIME-type: curl -v -H \"Accept: text/turtle\" * About to connect() to dbpedia.org port 80 (#0) * Trying 194.109.129.58* connected * Connected to dbpedia.org (194.109.129.58) port 80 (#0) OpenSSL/1.0.1b zlib/1.2.7 libidn/1.22 < HTTP/1.1 303 See Other < Date: Wed, 30 May 2012 20:03:31 GMT < Content-Type: text/html; charset=UTF-8 < Connection: keep-alive < Server: Virtuoso/06.04.3132 (Linux) x86_64-generic-linux-glibc25-64 VDB < Location: < Content-Length: 0 < * Connection #0 to host dbpedia.org left intact * Closing connection #0 The \"303 see other\" is wrong in this case, it should instead return a \"406 not acceptable\". Otherwise the client expects that he gets the requested MIME-type at the other location. With 406 it could return the HTML type as \"alternates\", that would be correct. 
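The same check as the curl call above, as a small Python sketch; the resource URI is the one used in the thread, and the 303-with-HTML behaviour it demonstrates is what was observed at the time, not a guarantee of current behaviour:

import requests

uri = "http://dbpedia.org/resource/Tim_Berners-Lee"
for accept in ("text/turtle",                        # single, possibly unsupported type
               "text/turtle, application/rdf+xml"):  # list that includes a supported type
    r = requests.get(uri, headers={"Accept": accept},
                     allow_redirects=False, timeout=60)
    print(accept)
    print("  status  :", r.status_code)              # 303 in both cases in the thread
    print("  location:", r.headers.get("Location"))  # target the server redirects to
    print("  type    :", r.headers.get("Content-Type"))

Comparing the Location and Content-Type for the two Accept values makes it easy to see whether the server negotiated to a format the client actually asked for.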
see: If I do it with a list of supported MIME-types (including one which is returned by DBpedia) it is correct: % curl -v -H \"Accept: text/turtle,application/rdf+xml\" * About to connect() to dbpedia.org port 80 (#0) * Trying 194.109.129.58* connected * Connected to dbpedia.org (194.109.129.58) port 80 (#0) OpenSSL/1.0.1b zlib/1.2.7 libidn/1.22 < HTTP/1.1 303 See Other < Date: Wed, 30 May 2012 20:15:07 GMT < Content-Type: application/rdf+xml; qs=0.95 < Connection: keep-alive < Server: Virtuoso/06.04.3132 (Linux) x86_64-generic-linux-glibc25-64 VDB < Accept-Ranges: bytes < TCN: choice < Vary: negotiate,accept < Content-Location: /data/Tim_Berners-Lee.xml < Link: ; rel=\"timegate\" < Location: < Content-Length: 0 < * Connection #0 to host dbpedia.org left intact * Closing connection #0 So now it redirects me with 303 to the rdf+xml version, which is an accepted MIME-type from the client. BTW I found it correctly described on a Virtuoso page ;) BTW2 any reasons that turtle is not supported? IMHO rdf+xml has to become extinct and replaced by turtle. Today someone asked me if they have to learn XML first to understand RDF because many samples in RDF books are in RDF/XML. That does not help adoption of RDF in the real world ;) cu Adrian uOn 5/30/12 4:28 PM, Adrian Gschwend wrote: See: . We need fix the re-write rule bug shown here: . uHello Adrian, On Wed, 2012-05-30 at 22:28 +0200, Adrian Gschwend wrote: I guess that's because some configs should be reviewed there. At the very beginning, only couple of formats were supported and turtle had \"x-\" prefix of experimental/custom status. Now the internals of Virtuoso recognize the following MIMEs: For SPARQL result-sets: \"text/rdf+n3\" , \"TTL\" \"text/rdf+ttl\" , \"TTL\" \"text/rdf+turtle\" , \"TTL\" \"text/turtle\" , \"TTL\" \"text/n3\" , \"TTL\" \"application/turtle\" , \"TTL\" \"application/x-turtle\" , \"TTL\" \"application/sparql-results+json\" , \"JSON;RES\" \"application/json\" , \"JSON\" \"application/soap+xml\" , \"SOAP\" \"application/soap+xml;11\" , \"SOAP\" \"application/sparql-results+xml\" , \"XML\" \"text/html\" , \"HTML\" \"application/vnd.ms-excel\" , \"HTML\" \"application/javascript\" , \"JS\" \"application/rdf+xml\" , \"RDFXML\" \"application/atom+xml\" , \"ATOM;XML\" \"application/odata+json\" , \"JSON;ODATA\" \"text/rdf+nt\" , \"NT\" \"text/plain\" , \"NT\" \"text/cxml+qrcode\" , \"CXML\" \"text/cxml\" , \"CXML\" \"text/csv\" , \"CSV\" For triples: \"application/x-trig\" , \"TRIG\" \"text/rdf+n3\" , \"TTL\" \"text/rdf+ttl\" , \"TTL\" \"text/rdf+turtle\" , \"TTL\" \"text/turtle\" , \"TTL\" \"text/n3\" , \"TTL\" \"application/turtle\" , \"TTL\" \"application/x-turtle\" , \"TTL\" \"application/json\" , \"JSON\" \"application/rdf+json\" , \"JSON;TALIS\" \"application/x-rdf+json\" , \"JSON;TALIS\" \"application/soap+xml\" , \"SOAP\" \"application/soap+xml;11\" , \"SOAP\" \"application/rdf+xml\" , \"RDFXML\" \"text/rdf+nt\" , \"NT\" \"application/xhtml+xml\" , \"RDFA;XHTML\" \"text/plain\" , \"NT\" \"application/sparql-results+json\" , \"JSON;RES\" \"text/html\" , \"HTML;MICRODATA\" \"application/vnd.ms-excel\" , \"HTML\" \"application/javascript\" , \"JS\" \"application/atom+xml\" , \"ATOM;XML\" \"application/odata+json\" , \"JSON;ODATA\" \"application/sparql-results+xml\" , \"XML\" \"text/cxml+qrcode\" , \"CXML;QRCODE\" \"text/cxml\" , \"CXML\" \"text/x-html+ul\" , \"HTML;UL\" \"text/x-html+tr\" , \"HTML;TR\" \"text/md+html\" , \"HTML;MICRODATA\" \"text/microdata+html\" , \"HTML;MICRODATA\" \"application/microdata+json\" , 
\"JSON;MICRODATA\" \"application/x-json+ld\" , \"JSON;LD\" \"application/ld+json\" , \"JSON;LD\" \"text/csv\" , \"CSV\" When everything is tweaked completely, the query page should recognize MIMEs of both lists and descriptions of resources uOn 31.05.12 08:12, Ivan Mikhailov wrote: Hi Ivan, ok that makes more sense, I run my own Virtuoso instance and in there I could get turtle :-) Would be great if that could be enabled on DBpedia as well later. cu Adrian uOn 30.05.12 23:25, Kingsley Idehen wrote: Hi Kingsley, ok that makes sense with what Ivan said. would be appreciated. tnx! Adrian uOn 5/31/12 4:50 AM, Adrian Gschwend wrote: Of course it will be enabled, the issue is a bug in the TCN (Transparent Content Negotiation) QoS algorithm that drives the re-write rules for DBpedia. It will be fixed (if not already the case). Kingsley" "Fw: dbpedia/snorql and dbpedia query through jena" "uHello, What is going on with Your are here: This page doesn't exist yet. Maybe you want to create it? I am trying to execute the following query through Jena. String prefixes = \"PREFIX owl: \n\" +         \"PREFIX xsd: \n\"+         \"PREFIX rdfs: \n\"+         \"PREFIX rdf: \n\"+         \"PREFIX foaf: \n\"+         \"PREFIX dc: \n\"+         \"PREFIX : \n\"+         \"PREFIX dbpedia2: \n\"+         \"PREFIX dbpedia: \n\"+         \"PREFIX skos: \n\"+          \"PREFIX dbpedia-owl: \n\"+         \"PREFIX geo: \n\";                 String sparqlQueryString = prefixes +         \"SELECT ?resource ?page ?icon ?lat ?lon WHERE {\n\" +         \"?resource foaf:name \\"\"+name+\"\\"@en ;\n\" +          \" foaf:page ?page ;\n\" +         \"dbpedia-owl:thumbnail ?icon ;\n\" +         \"geo:lat ?lat ;\n\" +         \"geo:long ?lon.\n\" +         \"}\";                 Query query = QueryFactory.create(sparqlQueryString);         QueryExecution qexec = QueryExecutionFactory.sparqlService(\"         try {                        ResultSet results = qexec.execSelect(); } and I get no results but when i execute the same query in Could you help me please?? Hello, What is going on with < + \"geo:long ?lon.\n\" + \"}\"; Query query = QueryFactory.create(sparqlQueryString); QueryExecution qexec = QueryExecutionFactory.sparqlService(\" uHello, My query used to work. My java machine crashed. That was the problem. As far as Thank you." "multi language extraction support" "uHi, I'm working on a project that aims linked data extraction from Turkish wikipedia content. Initially, I thought dbpedia data extraction framework can handle such objective, but after some investigation, I have realized that framework code includes some language specific parts, so I think such an effort won't be a straight forward process Actually, I have some questions about development environment and framework code: 1. I can't setup development environment with Eclipse. It is not possible to compile framework code using Eclipse environment. I have followed the instructions on dbpedia wiki ( I have tried some different version combinations for Eclipse, scala compiler, scala plugin, but no success Is there any one who use Eclipse IDE for development ? Any suggestions or different ways to setup environment ? Is it possible ?? or not By the way, IntelliJ IDEA compiles & runs the code with no hassle 2. In the light of previous discussions, I think there are some work for multi-national code base to support other languages. I have entered some Turkish infobox mappings and investigated & run the code, at this step, how can I proceed ? 
I can contribute the project for Turkish mappings and codebase for multi-language support. Besides, I have registered developer & discussion mailing lists with user name \"halilayyildiz\". Regards, Halil Ayyıldız Hi, I'm working on a project that aims linked data extraction from Turkish wikipedia content. Initially, I thought dbpedia data extraction framework can handle such objective, but after some investigation, I have realized that framework code includes some language specific parts, so I think such an effort won't be a straight forward processActually, I have some questions about development environment and framework code: 1.  I can't setup development environment with Eclipse. It is not possible to compile framework code using Eclipse environment. I have followed the instructions on dbpedia wiki ( Ayyıldız uThanks for your responses, especially for updated wiki page :) It is strange that my editor rights are disabled, I was able to add one mapping beforeAnyway, can I have editor rights again to add Turkish mappings ? username: halilayyildiz Regards, Halil Thanks for your responses, especially for updated wiki page :) It is strange that my editor rights are disabled, I was able to add one mapping beforeAnyway, can I have editor rights again to add Turkish mappings ? username: halilayyildiz Halil uHi Halil, you still have editor rights. What happens if you edit a mapping? Cheers, Anja On Apr 26, 2011, at 9:33 AM, Halil AYYILDIZ wrote:" "Queries on live.dbpedia.org return results with colon encoded as %3A in Category URLs" "uI'm using a sparql query on live.dbpedia.org to traverse a hierarchy of categories and return property values from pages in those categories.  There seems to be an issue where the URLs for some of the categories use \"%3A\" instead of \":\" which prevents the query from finding pages in that category. The example query below, when run on live.dbpedia.org/sparql shows that the first 32 categories have this issue, while the rest is ok. SELECT ?category replace(str(?category),'%3A',':') as ?category_fixed WHERE {   { SELECT ?category ?y WHERE {     ?category skos:broader ?y .     ?category rdfs:label ?categoryName .     FILTER (regex(?categoryName, \"power station\", \"i\") ||     regex(?categoryName, \"power plant\", \"i\") ||     regex(?categoryName, \"CHP plants\", \"i\") ||     regex(?categoryName, \"Wave farms\", \"i\") ||     regex(?categoryName, \"Wind farms\", \"i\")) .   } }   OPTION ( TRANSITIVE, T_DISTINCT, t_in(?category), t_out(?y), t_step('path_id') as ?path, t_step(?category) as ?route, t_step('step_no') AS ?jump, T_DIRECTION 2 )   FILTER ( ?y = ) .   OPTIONAL { ?plant dc:subject ?category } . } group by ?category order by ?category limit 50 Is this a known issue?  Is there a workaround that I can employ? Regards, Chris I'm using a sparql query on live.dbpedia.org to traverse a hierarchy of categories and return property values from pages in those categories. There seems to be an issue where the URLs for some of the categories use \"%3A\" instead of \":\" which prevents the query from finding pages in that category. The example query below, when run on live.dbpedia.org/sparql shows that the first 32 categories have this issue, while the rest is ok. SELECT ?category replace(str(?category),'%3A',':') as ?category_fixed WHERE { { SELECT ?category ?y WHERE { ?category skos:broader ?y . ?category rdfs:label ?categoryName . 
FILTER (regex(?categoryName, \"power station\", \"i\") || regex(?categoryName, \"power plant\", \"i\") || regex(?categoryName, \"CHP plants\", \"i\") || regex(?categoryName, \"Wave farms\", \"i\") || regex(?categoryName, \"Wind farms\", \"i\")) . } } OPTION ( TRANSITIVE, T_DISTINCT, t_in(?category), t_out(?y), t_step('path_id') as ?path, t_step(?category) as ?route, t_step('step_no') AS ?jump, T_DIRECTION 2 ) FILTER ( ?y = < Chris" "User contributed data to DBpedia, was: Add your links to DBpedia workflow version" "uHi Søren, (I renamed the topic and replaced DBpedia developers list with discussion) in general, we would like to improve nothing in DBpedia, whcih should be fixed in Wikipedia. So population numbers should be fixed there (or WikiData soon hopefully). We really do not want to become something like Freebase, as DBpedia should stay a Semantic Web mirror of Wikipedia. By the way, wouldn't it be sufficient to link to EuroStats? On the other hand, it might make sense to create additional structures over DBpedia identifiers. Yago, umbel and schema.org already does this and provides their hierarchy for DBpedia to include. This should, however, not overlap with the DBpedia ontology. There are also plans to extend the Mappings Wiki, so everybody can customize mappings for personal use cases. This is not an easy topic however." "How To Do Deal with the Subjective Issue of Data Quality?" "uAll, Apologies for cross posting this repeatedly. I think I have a typo free heading for this topic. Increasingly, the issue of data quality pops up as an impediment to Linked Data value proposition comprehension and eventual exploitation. The same issue even appears to emerge in conversations that relate to \"sense making\" endeavors that benefit from things such as OWL reasoning e.g., when resolving the multiple Identifiers with a common Referent via owl:sameAs or exploitation of fuzzy rules based on InverseFunctionProperty relations. Personally, I subscribe to the doctrine that \"data quality\" is like \"beauty\" it lies strictly in the eyes of the beholder i.e., a function of said beholders \"context lenses\". I am posting primarily to open up a discussion thread for this important topic. uOn 4/7/11 2:58 PM, Gregg Reynolds wrote: Context is important. Unfortunately, trying to do the right thing in cyberspace just worsens matters, exponentially. I was trying assist someone else who seems to be preoccupied with the subject matter in question but simply won't take my suggestion re. starting a new thread. Two typo ladden headings later, here we are :-( Yes, in a nutshell. uOn 4/7/11 3:06 PM, Jiří Procházka wrote: +1 And it shouldn't be used as a perennial distraction mechanism re. Linked Data. There is no such thing as perfect data. +1 Thus when dealing with data driven *anything* the following separation of powers remain: 1. Data Presentation 2. Data Representation 3. Data Access Protocol 4. Data Query Language 5. Data Model 6. Actual Data accessible from a Location . The separations above are sometimes overlooked in the context of many Linked Data initiatives and demos. Inaccurate data at an Address doesn't render applications, services, or demos scoped to points 1-5 (above) useless. In fact, bad data can be very useful [1] :-) Links: 1. 
it_turns_out_bo.html uOn 4/7/11 8:59 PM, Michael F Uschold wrote:" "Share your Internationalized SPARQL queries (apache log files)" "uDear all, although this mail is mainly intended to the DBpedia Internationalization Committee members, I consider that it can be interesting to people managing SPARQL EndPoints (from private/public institutions or companies). If this is not the case for you I am very sorry for the spam. In the context of the 1st DBpedia Community Meeting , I will give a talk about esDBpedia , the Spanish chapter of DBpedia. It has been running for 20 months, logging all the SPARQL requests (apache logs). In this talk I will show you what I have found digging in the last 52 weeks logs (22 million SPARQL queries). If you are interested in the scripts developed to process this huge amount of information, please, do not hesitate and contact me. If you attend the meeting, please, take your logs with you (in an external hard disk) and look for me. It will be my pleasure to share experiences, data and publications. N.B.: Anonymization policies will be applied to files. Best regards," "English Wikipedia full dump finished for thefirst time since 2006" "uHi Brian, thanks for the hint. Will be a nice challenge when we run our extraction job for the next time. Have a nice Sunday. Chris" "UMBEL v 1.00" "uHi All, Structured Dynamics and Ontotext have just released version 1.00 of UMBEL. This version is the first production-grade release of this open source, reference ontology for the Web. For more information and downloads, please see In broad terms, here is what is included in the new version 1.00: * A core structure of 27,917 reference concepts (RCs), an increase of 36% over the prior version * The clustering of those concepts into 33 mostly disjoint SuperTypes (STs) * Direct RC mapping to 444 PROTON classes * Direct RC mapping to 257 DBpedia ontology classes * An incomplete mapping to 671 GeoNames features * Direct mapping of 16,884 RCs to Wikipedia (categories and pages); 60% of UMBEL is now mapped to Wikipedia * The linking of 2,130,021 unique Wikipedia pages via 3,935,148 predicate relations; all are characterized by one or more STs with 876,125 also assigned a specific type * And, some vocabulary changes, including some new and some dropped predicates. UMBEL's basic vocabulary can also be used for constructing specific domain ontologies that can easily interoperate with this and other systems. Enjoy! We solicit and welcome your comments and contributions." "object property extractor should check    rdfs:range" "uHi Vladimir, the lack of constraint enforcing in the DBpedia dataset was something I discovered some time ago. Still there are several reasons for that, I guess: 1. As Pablo Mendes explained in some past discussion - you can treat the property constraints not as real constraints, but just additional data points. That's why they are not enforced, since this would cause much data to be lost, but usually more data is better (at least in a statistics-based approach). 2. The second reason is that many of the DBpedia resources don't have any type assigned. So these data would be lost as well, even though most of them is valid. 3. If you only need data that is compatible with the constraints, you can enforce that on your own. Cheers, Aleksander uIMHO low quality data is never better. Don't know what you mean by \"statistics-based approach\", but if you mean Machine Learning, bad data is disastrous to learn from. Not always. 
If bad data is statistically negligible it doesn't make too much harm. On the other hand - in logic-based approach - bad data makes much more harm. By statistics-based approach I mean whatever method that bases inferences on counting of events. By logic-based approach I mean methods that does not take counting into account. If there is only one invalid assertion in 10.000 a statistics-based method will just ignore it, but logic-based method will draw false conclusions. I disagree that data quality should be left to a \"do it yourself\" approach. Well, it depends on the POV. As a data consumer you might be interested in the data with the best quality, but as a producer, you might be interested in providing more data, that the consumers will take care of. Cf. DBpedia approach towards properties. At the beginning it included data that was directly extracted from the infoboxes. At present these data are supplemented with data obtained via property mapping. But the original data extracted without mapping are still available. Since DBpedia is a community effort, you should not expect that everything will be perfect, since there is usually a shortage of human resources to implement all the great ideas. That's why DIY might be the only available option. Do you have specific objections to the DBO ontology? I have some, mostly related to redundant/non-orthogonal properties. But I don’t see serious defects in the class hierarchy. Well, I haven't made any in-depth study regarding the class hierarchy, but I doubt building a general purpose ontology might be done this way. Moreover there are other ontologies: Dolce, Sumo, Cyc, BFO, etc. that are better suited for this task. Regarding specific issues - e.g. the lack of is-a/genls distinction in type assignment, mixing of roles with semantic/ontological categories, lack of coverage in a number of topics, lack of higher order concepts, to name just a few. Most of the classes lack NL definitions, which makes everything even harder. I'm not going into properties, since this is a mess. Data should not just be thrown away: the extractor should also make a list of exceptions, to be given back to the: - DBpedia Mapping community - Wikipedia editorial communities Well, that would be great, but I guess there are not enough human resources to implement these ideas." "DBpedia and Protege" "uHello everyone, I want to download and open DBpedia ontology in Protege 4.3 but I am able to open only TBox (Schema) ontology. How could I see the entire DBpedia Knowlege base in Protege as protege provide some features like obtain a part of the knowledge base of focus on (visual aid). Please let me know how to proceed. Thanking you in advance. Regards, Ankur Padia. Hello everyone, I want to download and open DBpedia ontology in Protege 4.3 but I am able to open only TBox (Schema) ontology. How could I see the entire DBpedia Knowlege base in Protege as protege provide some features like obtain a part of the knowledge base of focus on (visual aid). Please let me know how to proceed. Thanking you in advance. Regards, Ankur Padia. uDear Ankur Padia, a possible approach could look like that: 1.) download the ontology 2.) download the mapping-based types and the mapping-based properties from 3.) merge them in one file (probably easiest by converting the ontology to NT and then appending the other two files to it) 4.) try to load the merged file in Protégé 5.) go to a computer store and buy more RAM 6.) 
go back to step 4 but I suppose it's not possible on an average computer even with repetitive execution of step 5. Best, Heiko Am 21.02.2014 11:31, schrieb Ankur Padia: uHi Ankur, for starters, have a look at, e.g., Best, Heiko Am 21.02.2014 12:41, schrieb Ankur Padia: uAnkur, you may try one the explorative LOD visualization tools around, e.g.: YAGO (all triples for an entity: RelFinder: www.visualdataweb.org/relfinder.php (only comparatively, e.g. put Newton and Leibniz and look their shared influence network) Aemoo: aemoo.org (pattern-based, shifts between typical and unusual facts) good luck Aldo On Feb 21, 2014, at 1:14:35 PM , Heiko Paulheim < > wrote: uHi Ankur, I also like LOD Live: Hope this helps. Cheers! On 2/21/14, 1:49 PM, Aldo Gangemi wrote: uHello Marco, This tool (LOD Live) is relatively good as compared to others. Thank you everyone for your valuable insights. Regards, Ankur Padia. On Fri, Feb 21, 2014 at 6:49 PM, Marco Fossati < > wrote:" "Clarification AW: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase" "uKingsley Idehen wrote: Should have read: If the Web is becoming a real Graph Model Distributed Database were the Entities in the Database are defined formally via *loosely bound Data Dictionaries*, the utility of Domain and Range should be self-describing and obvious. For starters, higher level user interaction solution pursuits will benefit immensely from exploitation of domain, range, and other aspects of OWL. The association between DBpedia individuals and UMBEL, OpenCyc, and Yago is an example of the above which exists today. Also take a peek at the following via define input:same-as \"Yes\" select ?tp where { a ?tp } The query shows how the loose binding of OpenCyc, UMBEL, and Yago can be used explore DBpedia down a plethora of paths. And this is before delving into leveraging the data dictionaries for higher level UI. Kingsley" "Problems running extraction framework" "uHi all, I've recently been trying to the basic extraction framework running on my PC using the latest SVN revision. However, I've immediately been running into some strange errors Environment: * Windows Vista SP1 * PHP 5.2.9 Full Installation (fresh) * Command line (Powershell) From Jens' instructions in a previous message (Re: Navbox templates) I simply attempted to run extract_test.php from the command line. php -f extract_test.php This unfortunately gives me a host of error messages and warnings. (Specifically, they are a set of \"unable to find library\", a set of \"cannot find module\" and a \"undefined function\" error relating to the parsePage function right at the end.) I've uploaded the entire output of the console from running that command at (it's most just source code of the files being parsed), so hopefully it will make sense to someone. I would very much appreciate some assistance in debugging what would seem to be a quite straightforward task. Thanks, Alex uAlex, All the \"Unable to load dynamic library\" stuff indicates a borked PHP installation. It tries to load all those DLL's because they are listed in some config file (php.ini probably) and it can't find them in the specified location. Fix this first. It also seems that the php.ini setting \"short_open_tag\" is off in your installation. It needs to be on or the DBpedia code won't run. 
Best, Richard On 13 May 2009, at 01:13, Alex wrote: uHi Richard, I originally suspected the same thing (that the PHP installation had gone horribly wrong), but a complete uninstallation followed by a complete reinstallation hasn't improve the situation at all, I'm afraid. I did however come across a certain troubleshooting page for the issue ( comment out the extensions that we're giving me the errors. However, although I have eliminated the first set of error messages, the \"Cannot find module\" ones persist, and additionally there are some PHP warnings/notices. I've uploaded the new console output to . Setting the \"short_open_tag\" to off ini php.ini seems to fix the last error (at the end of the console output), so that's some progress at least. I appreciate your help in resolving this issue. Regards, Alex Richard Cyganiak wrote: uOn 13 May 2009, at 12:24, Alex wrote: Google says: \"You need the following system environement variable: MIBDIRS=c:\php\mibs (or where ever they are)\" Rename databaseconfig.php.dist to databaseconfig.php and enter appropriate values for your MySQL installation there. Best, Richard uHi Richard, All seems to be working well now in terms of the PHP and DBpedia configuration. It appears that disabling the appropiate extensions (the ones that gave the error as well as the SNMP one) did the job. Unfortunately, my attempts to run extract_test.php are still not succeeding. When I execute the script from the command line, it simply pauses for about 5 seconds, the returns without outputing *anything*. So no errors, but no results either. As I understand, extract_test.php ought to simply dump the RDF triples to the standard out stream, but I'm not seeing anything in the console, databases, or as generated files. Any suggestions as to what might be going wrong? Perhaps there is a guide on how get an extract up and running somewhere? ( the extraction framework, so it's not too helpful here.) Thanks for your assistance with everything. Regards, Alex uNo worries, in that case. I appreciate you assiting me with the general PHP setup anyway. I'll keep having a play around with it myself, though indeed if someone else could give me a few tips here, that would be helpful. Alex Richard Cyganiak wrote: uHallo, Alex schrieb: \"php extract_test.php\" (sometimes \"php5 extract_test.php\" depending on your machine) should give you this in latest SVN: . There are no requirements apart from PHP 5 (command line), an internet connection and Wikipedia not being offline. I just tested it and can confirm that it works. Can you try again? If it does not work, can you make sure that not offline? Kind regards, Jens" "Number of infoboxes" "uI wonder whether one of you good folk could kindly answer a quick question for me, please? How many articles on he English Wikipedia have infoboxes? As of what date? I appreciate that there will be caveats! uOn Thu, Aug 1, 2013 at 2:18 PM, Andy Mabbett < >wrote: Your subject and body ask two different questions. Articles can have more than one infobox. Are you interested in the count of infoboxes or count of articles? Tom On Thu, Aug 1, 2013 at 2:18 PM, Andy Mabbett < > wrote: I wonder whether one of you good folk could kindly answer a quick question for me, please? How many articles on he English Wikipedia have infoboxes? As of what date? I appreciate that there will be caveats! Your subject and body ask two different questions.  Articles can have more than one infobox.  Are you interested in the count of infoboxes or count of articles? 
Tom uHi Andy, a rough answer can be given using the following SPARQL query: select count(distinct ?x) where {?x dbpprop:wikiPageUsesTemplate ?t . FILTER(REGEX(?t,\"^ which returns around 1.9 Mio infoboxes (without \"distinct\"), and 1.1 Mio pages with at least one infobox (with \"distinct\") (thanks Tom for pointing out that difference). However, I am not sure how many infoboxes do actually follow that naming convention. Best, Heiko Zitat von Tom Morris < >: uThat's consistent with previous analysis I ran directly on Wikipedia EN data dumps. The overall number of infoboxes in Wikipedia EN has consistently been around 45-50% of the total number of article pages (i.e. 4.2M) for the past 3 years. I don't know about other languages. -N. On 8/1/13 12:38 PM, \"Heiko Paulheim\" < > wrote: uOn 1 August 2013 20:38, Heiko Paulheim < > wrote: [Apologies to Heiko Paulheim for sending an earlier reply direct] Thank you, all. The former figure tallies roughly with my estimate on Wikipedia, of \"over 1,951,000\" excluding the further 227,000+ using Template:Taxobox. There are also over 20,000 examples using Template:Geobox and a few less-widely used templates not yet renamed. I'm surprised that there are so many aritcles with two more more infoboxes (assuming few have more than two, the number would be about 400,000). uThere are a few other infoboxes that are not named \"Infobox \" listed here: (The URL ends with a \":\" colon.) But if a number +/- 10% is OK for you, they probably don't matter. JC On 2 August 2013 01:02, Andy Mabbett < > wrote:" "Extracting region/state information for populated places" "uHello everyone, I am a bibliometrics PhD and I am a newbie to dbpedia. I have been trying to get region info through Wiki infoboxes. I wrote the query as below: SELECT DISTINCT * WHERE { ?city rdf:type dbpedia-owl:PopulatedPlace . ?city rdfs:label ?label. ?city dbpedia-owl:abstract ?abstract . ?city dbpedia-owl:country ?country . ?country dbpprop:commonName ?country_name. OPTIONAL { ?city dbpedia-owl:isPartOf ?partOf } . OPTIONAL { ?partOf dbpprop:name ?partOfname } . OPTIONAL { ?city dbpprop:region ?region }. OPTIONAL { ?city dbpedia-owl:postalCode ?postal_code }. FILTER ( lang(?abstract) = 'en' && lang(?label) = 'en' && regex(?country, \"Germany\") && regex(?label, \"Bonn\")) I realized that there is no region information for some places in their related dbpedia pages even though the region info appears in the original Wikipedia page. For example, for Bonn of Germany contains state info (North Rhine-Westphalia). However, in its dbpedia page I could not find this information. The same issue is valid for Leuven, Belgium. Actually, the region information appears only in the abstract part. Am I missing something? If you can guide me, it would be great. Thanks in advance. Kindest regards, Mehmet uDear Mehmet you can improve / increate the data that are extracted from Wikipedia by better mapping the infobox to the DBpedia ontology [1] [2] This is a crowdsourced process that needs regular updates due to changes in Wikipedia templates Once you improve the mappings, DBpedia Live [3] will reflect all the changes (within a short time) but dbpedia.org will be updated on the next static release. Best, Dimitris [1] [2] [3] On Thu, Jan 23, 2014 at 12:45 PM, Mehmet Ali Abdulhayoglu < > wrote: uThank you Dimitris. These links are really helpful. 
Best regards, Mehmet From: Dimitris Kontokostas [mailto: ] Sent: Thursday 23 January 2014 12:17 PM To: Mehmet Ali Abdulhayoglu Cc: Subject: Re: [Dbpedia-discussion] Extracting region/state information for populated places Dear Mehmet you can improve / increate the data that are extracted from Wikipedia by better mapping the infobox to the DBpedia ontology [1] [2] This is a crowdsourced process that needs regular updates due to changes in Wikipedia templates Once you improve the mappings, DBpedia Live [3] will reflect all the changes (within a short time) but dbpedia.org will be updated on the next static release. Best, Dimitris [1] [2] [3] On Thu, Jan 23, 2014 at 12:45 PM, Mehmet Ali Abdulhayoglu < > wrote: Hello everyone, I am a bibliometrics PhD and I am a newbie to dbpedia. I have been trying to get region info through Wiki infoboxes. I wrote the query as below: SELECT DISTINCT * WHERE { ?city rdf:type dbpedia-owl:PopulatedPlace . ?city rdfs:label ?label. ?city dbpedia-owl:abstract ?abstract . ?city dbpedia-owl:country ?country . ?country dbpprop:commonName ?country_name. OPTIONAL { ?city dbpedia-owl:isPartOf ?partOf } . OPTIONAL { ?partOf dbpprop:name ?partOfname } . OPTIONAL { ?city dbpprop:region ?region }. OPTIONAL { ?city dbpedia-owl:postalCode ?postal_code }. FILTER ( lang(?abstract) = 'en' && lang(?label) = 'en' && regex(?country, \"Germany\") && regex(?label, \"Bonn\")) I realized that there is no region information for some places in their related dbpedia pages even though the region info appears in the original Wikipedia page. For example, for Bonn of Germany contains state info (North Rhine-Westphalia). However, in its dbpedia page I could not find this information. The same issue is valid for Leuven, Belgium. Actually, the region information appears only in the abstract part. Am I missing something? If you can guide me, it would be great. Thanks in advance. Kindest regards, Mehmet" "DBpedia Live changesets stuck" "uHi Dimitris The dbpedia live change sets have been stuck for some while at Nov 12th Any updates on when this is going to start working again. thanks Paul Hi Dimitris The dbpedia live change sets have been stuck for some while at Nov 12th Any updates on when this is going to start working again. thanks Paul uOn 2/3/15 5:24 AM, Paul Wilton wrote: Dimitris, I am increasingly confused as to why this is happening. Why is MySQL playing a pivot role in a workflow for which it is ultimately ill-suited, as demonstrated by the the current state of affairs? uHi Paul , Kingsley, We have some server issues that we are trying to solve and Live will be hopefully be up again by mid/end-February. Best, Dimitris On Tue, Feb 3, 2015 at 6:02 PM, Kingsley Idehen < > wrote: uDon't know if \"stuck\" is the reason, but see this: - How many newspapers in dbpedia.org? select count(*) {?x a dbpedia-owl:Newspaper} -> 6043 (as ofhmmAug 2014: pretty close to wikidata: 6275) - How many in live.dbpedia.org? select count(*) {?x a dbpedia-owl:Newspaper} -> 2583. OOPS. BTW Kingsley, the dbo: prefix is gone. Cheers! uHi Dimitris Any progress on this ? thanks and kind regards Paul On Wed, Feb 4, 2015 at 1:52 PM, Dimitris Kontokostas < > wrote: uHi Paul, all Live is running again. 
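One quick way to see whether the Live endpoint has fallen behind the static one is to run the same count against both, much like the spot check in the message that follows; the endpoint URLs and the class used here are only examples, not part of the thread:

import requests

QUERY = "SELECT (COUNT(*) AS ?c) WHERE { ?x a <http://dbpedia.org/ontology/Newspaper> }"
for endpoint in ("https://dbpedia.org/sparql", "https://live.dbpedia.org/sparql"):
    r = requests.get(endpoint,
                     params={"query": QUERY, "format": "application/sparql-results+json"},
                     timeout=60)
    r.raise_for_status()
    count = r.json()["results"]["bindings"][0]["c"]["value"]
    print(endpoint, count)

A large gap between the two counts for a class that changes slowly is a reasonable hint that Live updates are lagging.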
For now I disabled abstracts until we get up-to-date with the latest wikipedia updates I adapted a bit more the diff format and split it in 4 files to make it compatible with external tools Main diff (like before) * added -> triples to add * removed -> triples to delete the following 2 are for clean updates (optional to execute) * reinserted -> unchanged triples that can be reinserted * clear -> delete queries that clear all triples for a resource and the proper order for execution is: removed, clear, reinserted, added As mentioned earlier the clean updates are meant to cover bugs in different stages of the update process and are optional Since I recovered from an older (~1 month) backup I marked all records to perform a clean update. After all records are processed clean updates will be performed after every 5 extractions of a wikipedia page The live-mirror tool is updated as well On Tue, Mar 3, 2015 at 5:50 PM, Paul Wilton < > wrote: uGreat thanks I have some questions. If we have mirror running based on the 30th Sept 2014 dump, and were mirrored up to the Nov 24th last update before it broke, how do I use the files here to bring it up to date now ? What is the purpose the 14Gb 2014 tar.gz file ? (it looks clear that our current updater is not going to cope with that, as you have deviated from the file/disk structure) and will the abstracts eventually be brought up to date too (so no gaps) - these are important to us thanks Paul On Mon, Mar 9, 2015 at 1:33 PM, Dimitris Kontokostas < > wrote: uOn Mon, Mar 9, 2015 at 4:02 PM, Paul Wilton < > wrote: If you use the clean update diffs you should recover from the mis-alignment between the restored backup and the diffs generated after the backup date no, this is just a diff dump for people who want to download the whole 2014 changeset locally Yes, we need change some infrastructure to properly support them but will be back soon. I suggest you use the abstracts from the static dumps in a separate graph until we tackle this. FYI, We now plan to provide two static releases per year (April & October) uOkay so when you say the clean diffs, there is a big gap between Nov 12th and March 6th ? How is this gap filled in ? On Mon, Mar 9, 2015 at 2:13 PM, Dimitris Kontokostas < > wrote: ulike 30 minutes ago we processed up to 2014-11-16 22:42:24 so it should catch up in a few days On Mon, Mar 9, 2015 at 4:20 PM, Paul Wilton < > wrote: uhey Dimitris just checking in on the live updates - its been a few days, but I don't see any of the gaps being filled between November 11th and March here : ?? thanks Paul On Mon, Mar 9, 2015 at 10:30 AM, Dimitris Kontokostas < > wrote: uHi Paul, The changesets are not sequential with modified date but with extraction from Live so we move to the 2015 folder now Dimtiris On Fri, Mar 13, 2015 at 5:38 PM, Paul Wilton < > wrote: u0€ *†H†÷  €0€1 0 +" "Issue with the live datastream" "uHi Mohamed, I would like to know what timestamp the live stream is actually using to fetch from wikipedia. There are articles from 4 October 2011 that are in Wikipedia but not yet in the live stream e.g.: which cannot be found on either or Furthermore when i look at the live statistics page on: and i check some of the wikipedia articles it claims it has updated e.g.: and i check this page history, i can see the last modification to this page is from 9 june 2010. Same goes for modified on 21 August 2011. 
Patrick uHi Patrick, On 10/04/2011 03:34 PM, Patrick van Kleef wrote: Yes you are right, I've also noticed that, the live stream may get blocked for sometime as after a while, DBpedia got updated. That is normal behavior, as we also have 2 other feeders, one for the pages affected by a mapping change, and another one for the pages which are not modified for a long time. So, those pages could be inserted in the queue for processing through any of those feeders." "GSoC 2016: Inferring Infobox Template Class Mappings From Wikipedia and WikiData" "u*Mentor: Nilesh ChakrabortyStudent: Peng Xu* In the first phase of the project, I complete a approach to find new mappings for languages with few existing mappings based on languages with enough mappings and cross-language links. This approach can achieve fairly good evaluation results. And I get 456 high-quality new mappings for Chinese after my manual check. Due to the incomplete coverage of mapping on DBpedia, we need to predict ontology types for instances without a type. In the second phase, I try different methods including tensor factorization and graph embeddings on DBpedia to do type prediction. The experiments show that tensor factorization can achieve good performance on small languages like Bulgarian. However, for larger languages it performs badly due to the limit of memory. All the scripts I wrote can be easily applied to other languages if datasets are downloaded to the proper path. All my code and detailed documents can be found here: Further work: Currently, I ignore the literals in DBpedia when doing tensor factorization. The next step is to add the literal information. Furthermore, considering the issues about time and memory complexity, a distributed implementation of the algorithm can be useful if we want to apply the ideas on languages like English. Best Regards Peng Xu uGreat work Peng Xu & Nilesh! @everyone, the Chinese results look quite good, the ones with the English names at least, I couldn't tell the rest :) We want to apply this to other languages with low on no mappings at all but we need native speakers for the evaluation Any volunteers? If we hurry a bit we will use these mappings in the upcoming release Cheers, Dimitris On Tue, Aug 23, 2016 at 12:31 AM, Peng Xu < > wrote:" "how to get IntermediateNodes ("sub-objects") of a resource" "uOn Saturday, 22 November 2014 12:53:08 UTC+2, Daniel Fleischhacker wrote: Unfortunately DESCRIBE doesn't return the IntermediateNodes (\"sub-objects\") of a resource. Eg DESCRIBE for Berbatov ( dbpedia:Dimitar_Berbatov dbpedia-owl:careerStation dbpedia:Dimitar_Berbatov2 , dbpedia:Dimitar_Berbatov6 , dbpedia:Dimitar_Berbatov1 , dbpedia:Dimitar_Berbatov4 , dbpedia:Dimitar_Berbatov9 , dbpedia:Dimitar_Berbatov3 , dbpedia:Dimitar_Berbatov7 , dbpedia:Dimitar_Berbatov5 , dbpedia:Dimitar_Berbatov10 , dbpedia:Dimitar_Berbatov8 ; but doesn't return the info about these careerStations (matches, goals, team, shirt number, etc). DESCRIBE usually returns a CBD or Symmetric CBD, where \"sub-objects\" are marked by being blank nodes. I hate blank nodes (hard to point out or debug) and agree it's better to have them with URLs as shown above. But this makes them sort of hard to recognize. Messing with URLs is usually a bad idea, but I think in this case it's the only way. Kingsley, can you possibly hack this Virtuoso instance (or maybe what's called \"dbpedia VAD\") to do this: When doing DESCRIBE ?s: Get ?s ?p ?o. str(?o) matches \"\d+$\" then add ?o ?x ?y to results." 
"Fact ranking game (creation of ground truth)" "u[Apologies for cross-posting. Please redistribute within your own group or among colleagues, thank you!] In recent years we have experienced an exponential increase of knowledge available on the web. Having such an extensive amount of information poses a challenge when trying to identify globally relevant or important facts. In effort to address this issue, several fact ranking systems have already been developed. However, the difficulty with current state-of-the-art systems is that there is no gold standard corpus which could serve as a ground truth for evaluating their performances. We have developed a fun and exciting quiz which will help us to tackle this issue by using the wisdom of the crowd. While you are playing, your inputs (which are greatly appreciated) will contribute to our scientific experiment. Together, we will build a first corpus that will help our scientific community in evaluating different fact ranking strategies! Here you will find our tool that is used to rank facts about ~500 popular entities from Wikipedia. You have to register with the tool and then the task will be explained to you in detail. You might interrupt your rating of the presented facts any time you like and continue later. To make it a bit more interesting, you will be able to score points and see your ranking in a highscore list. We would really appreciate your help in this task. Please do also spread the word. The more participants, the more valid our ground truth will be. Thanks and best regards, Semantic Technologies Team" "Junior Researcher Position / Ph.D. Position in Semantic Web Technologies" "uJob Offer: Junior Researcher / Ph.D. Position 
starting February 2014 at the Hasso-Plattner-Institute (HPI), Potsdam. HPI is Germany’s university excellence center for IT Systems Engineering affiliated with the University of Potsdam, Germany. The Institute's founder and benefactor Professor Hasso Plattner, who is also co-founder and chairman of the supervisory board of SAP AG, has created an opportunity for students to experience a unique education in IT Systems Engineering in a professional research environment with a strong practical orientation. Research at HPI is characterized by scientific excellence and close cooperation with industry.
(for more information about HPI, cf. Project Description: D-WERFT (engl. ‘Digital Dockyard‘) is a national research project funded by the German Federal Ministry of Education and Research, with the goal to integrate the value creation process of the media industry with the help of semantic technologies and Linked Open Data. This process includes media planning and production, media post-production, media distribution, media archival, and digital rights management. The project is supported by 10 German project partners from industry, education, and research, and is scheduled for 3 years starting in 2014. Within D-WERFT, HPI ist responsible for Linked Data based data integration, semantic multimedia analysis and retrieval. Main research topics: - Linked Data engineering - named entity recognition and mapping - event detection - ontological engineering - knowledge engineering and knowledge mining - recommendation systems - semantic and exploratory search - multimedia analysis including" "Instance from dbpedia" "uI made a small video of how to extract Capitals Of Countries as RDF with the visual tool I mentionned in a previous mail. You can have a look at it here: I hope it helps for the data extraction part of your process. On Thu, Mar 24, 2016 at 6:04 PM, kumar rohit < > wrote:" "Bad SPARQL results from DBPedia endpoint?" "uHello Gregory, On Mon, 2010-07-12 at 14:22 -0400, Gregory Williams wrote: It looks like something had corrupted the database. The query SELECT ?g ?s ?o ( (?g)) ( (?s)) ( (?o)) WHERE { graph ?g { ?s a ?o } } shows that rows with empty ?g or ?s contains IRI IDs with reasonable values but the database does not contain string IRI values that correspond to that IDs. Maybe some application forcibly committed ill data instead of rollback on some error handler in hope that partial data about something is better than nothing uGregory Williams wrote:" "Page #'s, Revision #'s and PND?" "uI've noticed a few new files show up in dbpedia 3.5. These include the \"page number\", \"revision number\" and \"PND\" tables? What exactly are these? I know that freebase keys things to wikipedia by a \"page number\" that seems to be something internal to the mediawiki system Is that what the page number is here? BTW, the freebase mappings look GOOD, much much better than what's been in previous dbpedias In particular, these are real NT files that point to real linked data URLs. I'll have a more definite view once my scripts finish overnight, but it looks like the freebase mappings pretty much match what can be derived from the \"Simple Topic Dump\" that FB releases, and for all I know, that's where they come from. If the mappings look OK I might be soon releasing an NT file that specifies rdf:type between dbpedia resources and Fbase types. (If you had an rdf dump of Freebase, this would be equivalent to what you'd get with the owl:sameas statements declared in the freebase mappings.) I'm also thinking about improved rdf:type statements to the dbpedia ontology as well Definitely many more Person(s) can be found in fbase A really good answer to \"City\", \"Town\", etc. might need to wait until I've got a certain feedback loop established. uPaul Houle wrote:" "Querying for ontology definitions in SPARQL" "uHi, does the SPARQL endpoint have the recent ontology loaded? From the definition of dbpedia-owl:country, is see that it is a subproperty of dul:hasLocation: However, SPARQLing for subproperties of dul:hasLocation gives me zero results: select ?x where {?x rdfs:subPropertyOf } Any ideas? 
Thanks, Heiko uHi Rumi and Kingsley, thanks for the advice and clarification, it works well! Best, Heiko Am 17.10.2014 um 13:39 schrieb Kingsley Idehen:" "instance_types and infoboxes" "uHi, thanks for helping me with my previous question. I do now understand that the classification of persons in the instance_types file is done by using the infobox type. So for example all persons with an infobox of the type \"writer\" are classified as writers. Oddly there are many persons classified as writers whose articles don't contain infoboxes at all. A few examples: - Richard Zenith - Carolyn Graham - Alvilde Prydz - N. J. Crisp How come instance_types tags them as writers? Is there another way of classification other than using the infobox type? Thanks again, Christoph uHi Christoph, that is correct but it is not the complete truth. The classification is done based on Wikipedia templates (infoboxes are a special type of template) that are used in the article of the person. There is no restriction of the template types to visible infoboxes. This Wikipedia article contains a template called US-writer-stub which is mapped in DBpedia to the class Writer, see [1]. The other instances you mention seem to have stub templates assigned, too. Thus, to see more precisely where a certain instance type is coming from, you have to look at the source code for all templates used in the article and see if some of these templates are mapped in the mappings wiki. Cheers, Daniel [1] uAlso: as Volha explained on 12/9/2014 12:02 PM, there is a file \"mapping-based-types-heuristic\": See sample at And on 12/11/2014 2:59 PM I gave an example that the extractor can make \"stub resources\" even when there's no wiki page, but the person is mentioned in an appropriate list. E.g. from a football team roster, it infers FootballPlayers, with classes & name, and position (jersey number, team). E.g. from and the redlink \"Ryan Sappington\", it infers - stub person: - position: Fredericksburg_Hotspuryan_Sappington1" "Dutch mappings" "uHi, I would like to contribute and start making mappings for the Dutch language domain. Can I get user rights for Mappings (nl)? And what about adding local labels (rdfs: ) to classes and properties in the ontology; can I do that too? Thanks, Roland Cornelissen uHi Roland, what's your user account on I'll add you to the editor group. You can then edit any mappings in any language. Yes, you can add labels in all languages to classes and properties. Regards, Christopher On Mon, Feb 27, 2012 at 22:35, Roland Cornelissen < > wrote: uHi Christopher, My user name is Roland. Thanks! Roland On 02/28/2012 10:50 AM, Jona Christopher Sahnwaldt wrote: uDone! Happy mapping! On Tue, Feb 28, 2012 at 11:15, Roland Cornelissen < > wrote:" "Querying datasets" "uHello, I've downloaded the datasets from the dbpedia website, and now I'm trying to create a system that allows me to query the data. I've come across things like Jena, but I'm not sure how well it will scale - I've had an import running for about a day and it's nowhere near done! What do you recommend for doing this? What does the dbpedia site use as its back end? Carl uHi Carl, We are using Virtuoso as backend, which works fine.
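Before any benchmarking, a quick way to confirm that a freshly loaded store actually holds data is a tiny probe query; this is only an illustrative sketch and is not specific to Virtuoso:

SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10

If that returns rows, the load worked and the heavier queries discussed below become meaningful.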
Some resources that might help you choosing the right backend are collected at the end of the ESW Wiki page about RDF benchmarks Christian Becker, a student at FU Berlin who is currently writing his master thesis about using DBpedia data in a mobile context has also run some benchmarks with DBpedia data. He will put a link to his results in the Wiki over the week-end. Cheers Chris uMy results are here: benchmarks-200801/. Feedback is appreciated! Cheers, Christian uHi Christian, very interesting results. Your heavy use of filters seems to put all stores in serious trouble. I'd suggest to add Jena to the comparison, there've been some effort in query optimization by Markus Stocker (cc'ed). Markus, could you help analyzing the results? Also, I would suggest to use a \"x ?p ?o. ?s ?p x.\" query instead of just \"x ?p ?o.\" in 6.1, as there are no inverseProperty statements in DBpedia. But compared to your other results, that seems to be an insignificant detail Cheers, Georgi uSorry, you have SDB in your comparison. My mistake Georgi uChristian Becker wrote: Christian, Do you have .rq files for the queries used in your query benchmarks? Kingsley uHi, Thanks! I'd be glad to run a few more queries - the hard part was getting everything up and running, and waiting for the load to complete :) Does anybody have queries to suggest, too? I will collect them and run them in a couple of weeks or so. Cheers, Christian uKingsley, I attached them and put them on the website as well. Cheers, Christian ‹ uChristian Becker wrote: Thanks, I will take a look at these. uChristian Becker wrote: Christian, Virtuoso is a Quad Store as opposed to a Triple Store. We use Named Graphs to partition Triples within the DBMS. Thus, you need to identify the target Graph (via it's URI) in your queries by: 1. Using the graph-uri parameter in SPARQL Protocol Queries 2. FROM Clause in SPARQL statements 3. Use the Virtuoso's INI \"[SPARQL]\" to set the Default Graph to a specific Graph Name 4. Specific scoping of SPARQL patterns to specific Graphs when joining across graphs Assuming you have DBpedia data in an internal Graph Named: (*which is what we have in the live instance for example): Query 1: SELECT ?p ?o FROM WHERE { ?p ?o } Kevin Bacon: PREFIX p: SELECT ?film1 ?actor1 ?film2 ?actor2 FROM WHERE { ?film1 p:starring . ?film1 p:starring ?actor1 . ?film2 p:starring ?actor1 . ?film2 p:starring ?actor2 . } There are even more tricks to come, but let's get the very basic stuff in place as your benchmarks currently do not reflect Virtuoso accurately. At the same time, this isn't your fault as we need to do a much better job of making the usage scenarios and associated optimizations much clearer in our usage documentation etc Kingsley uHi all, thanks to feedback from OpenLink I modified Virtuoso's indexing configuration for my benchmark. The results are much more favorable for Virtuoso now, and there's no need for graph indication. Anyway, I think you just won my benchmark ;) The URL is: may need to refresh the page in your browser in order to load the new charts). Cheers, Christian On Jan 14, 2008, at 3:29 PM, Kingsley Idehen wrote: uHi Christian Seems pretty good! however would it be possible to rework the horizontal columns of your graph? Because a 4sec answer time for Virtuoso looks like just \"lower\" than a 900sec for Posgre. Dunno what the modification could looks like, but I think something should be done because it is really missleading to the eyes if you don't read the real time. 
Just my two pennies :) Take care, Fred uHi Christian, thanks for sharing your results. I think it would be a good idea to run the database optimization Andy suggested. It feels a bit unfair to compare an optimized Virtuoso setup with an out-of-the-box SDB setup. However, something seems to be strange. I ran query 6.3 on seconds for the repeated second run. Are you sure that you did not step into a query cache trap? If not, could we use your server for the public DBpedia sparql endpoint in the future;) Cheers, Georgi uChristian Becker wrote: Christian, Great! This exercise helped remind us about some critical potholes in collateral collection for Virtuoso :-) Also note, that Virtuoso 5.0.3 already exists, and the 6.0 Cluster Edition is exiting R&D; as I write. In the 6.0 release we are planning to up the ante significantly re. Clustered deployment configurations. Thus, I would encourage you to consider follow-up benchmarks that cover: 1. Scalability (i.e. simulations covering concurrent clients performing a mix of queries) 2. Scalability (i.e. simulations covering concurrent clients performing a mix of queries against a variety of cluster configurations) 3. All of the above as part of a Linked Data Driven Business Intelligence benchmark (we are contributing a TPC-H derivative for Linked Data based on SPARQL-BI extensions) 4. All of the above against real and virtual (SQL-RDF mapping) Linked Data sets Kingsley uFred, you're right - as Virtuoso is no longer an outlier, there's no need for log scales anymore :) I updated the charts now. Thanks, Christian On Jan 16, 2008, at 2:14 PM, Frederick Giasson wrote: uHi Christian, Visually much better now :) (for everybody, not only Virtuoso). Thanks! Take care, Fred uHi again Also, another idea (not sure it will be better that way thought): Why not sorting the horizontal columns on time basis instead of \"group of names\". By this I mean: the better results at top, and the worse (the longer slide) at the bottom. It will be even better, and will gives some sense of \"grading\" to the graph from top to bottom. Could deserves a test; dunno. Take care, Fred uHi Georgi, I know - unfortunately, I have extremely limited time to do this at the moment. It's very unfair to leave it as it is, but I have no choice. Andy suggested a RAM upgrade in part, which definitely makes sense, but would require me to re-run all the tests. I see this happening for future tests. I would love to optimize the databases, but optimizing MySQL is a science of its own, and I don't have much experience with PostgreSQL. If a PostgreSQL ANALYZE is all it takes, I will do that - but I'm sure there are other parameters. If you come up with an optimized configuration, I'd be happy to implement it. Alternatively, I could also give you ssh access. That's why I always restart the server for each run, and have a fixed order of running the queries. I do this at least twice (although I only take the first result), and my results have been very consistent so far. I'm not sure about the indices on the DBpedia endpoint, but my guess is that they don't have the optimized indexing configuration. Without optimizations, query 3 took 112 seconds on my machine. Also note that I don't have all DBpedia sets loaded, and I don't know what else runs on the public endpoint. Thanks, Christian uGeorgi Kobilarov wrote: Georgi, Do you see the missing indexes as a special optimization? 
As I said, we are guilty of not documenting index usage scenarios with the kind of clarity required by time challenged evaluators etc. Of course, if there are are optimizations that others have we put them all in the pot and re-run. I would like to believe that Christian's benchmarks factor this in. But I'll leave this to Christian to confirm. Christian: Why not publish the actual benchmark program source code? This is what we do with all our benchmarks. Even better, if you are interested, we can provide a mechanism for you to add this to our Open Source Benchmark Tool [1]. I don't think Christian is trying to imply that you can run a live instance of DBpedia on I gig of RAM setup (amongst other things). This is exactly why I made my comments about a scalability variant of the benchmarks. Also, it would be nice if there were many DBpedia mirrors, this would enable a variety of scalability benchmarks based on real world usage scenarios (the thing we spend most of our time working on). Thus, it would be nice if you guys could bring an ARQ based DBpedia instance on line, especially as this would offer dual benefits beyond benchmarking RDF Store scalability. Links: 1. Kingsley uChristian Becker wrote: Correct, and we will now apply them :-) Kingsley uHi Christian, As you wish, was just an idea :) Take care, Fred uHi Fred, hmm I think in this case I prefer a consistent order - that way, the configurations are grouped by platform, which especially useful when you want to differentiate \"spoc, posc, opsc\" from \"spoc, posc, ospc\" :) Thanks, Christian On Jan 16, 2008, at 3:38 PM, Frederick Giasson wrote:" "GSoC - ListExtractor" "uHi, I am the GSoC participant for the List Extractor project and I would like to introduce myself (as suggested in the Slack channel). My name is Federica Baiocchi and I am a master’s student of Information Engineering from Università Politecnica delle Marche, Italy. My project will focus on the extraction of relevant information from lists in Wikipedia pages and on how to transform those valuable data in a RDF format to expand DBpedia knowledge base. I plan to start from pages either in English or in Italian concerning a specific domain and containing lists structured in a template; then I can try to expand my solution with different topics and/or a different language. The main aim is to pull out as much usable data as possible and to reach modularity and scalability for future generalization. I am looking forward to work with DBpedia and I will accept any kind of advice. Thank you for your attention, Federica Baiocchi. Hi, I am the GSoC participant for the List Extractor project and I would like to introduce myself (as suggested in the Slack channel). My name is Federica Baiocchi and I am a master’s student of Information Engineering from Università Politecnica delle Marche, Italy. My project will focus on the extraction of relevant information from lists in Wikipedia pages and on how to transform those valuable data in a RDF format to expand DBpedia knowledge base. I plan to start from pages either in English or in Italian concerning a specific domain and containing lists structured in a template; then I can try to expand my solution with different topics and/or a different language. The main aim is to pull out as much usable data as possible and to reach modularity and scalability for future generalization. I am looking forward to work with DBpedia and I will accept any kind of advice. Thank you for your attention, Federica Baiocchi. 
uThanks Federica for the intro, The codebase will be available here: and the weekly report here: Cheers, On 5/9/16 10:53, Federica Baiocchi wrote: uThanks Federica! Looking forward to seeing the results of your project Cheers, Dimitris On Mon, May 9, 2016 at 11:53 AM, Federica Baiocchi < > wrote:" "which version data set you used?" "uHi, I noticed, such one in Why that happened?" "Odd entries in instance-types_de.ttl.bz2, instance-types-en-uris_de.ttl.bz2, and instance-types-en-uris_ru.ttl.bz2" "uThe file http://downloads.dbpedia.org/2015-04/core-i18n/de/instance-types-en-uris_de.ttl.bz2 contains triples which I cannot seem to find in any online version and which appear to be wrong, for example: < http://de.dbpedia.org/resource/Barack_Obama > < http://www.w3.org/1999/02/22-rdf-syntax-ns#type > < http://dbpedia.org/ontology/Quote > Which says that Barack_Obama is a quote, which he isn't :) There is also the triple: < http://de.dbpedia.org/resource/Angela_Merkel > < http://www.w3.org/1999/02/22-rdf-syntax-ns#type > < http://dbpedia.org/ontology/Quote > There are similar odd entries in the file instance-types-en-uris_de.ttl.bz2, for example < http://dbpedia.org/resource/George_W._Bush > < http://www.w3.org/1999/02/22-rdf-syntax-ns#type > < http://dbpedia.org/ontology/WrittenWork > < http://dbpedia.org/resource/Angela_Merkel > < http://www.w3.org/1999/02/22-rdf-syntax-ns#type > < http://dbpedia.org/ontology/Quote > In the russian file instance-types-en-uris_ru.ttl.bz2 I find the following triple: < http://dbpedia.org/resource/Barack_Obama > < http://www.w3.org/1999/02/22-rdf-syntax-ns#type > < http://dbpedia.org/ontology/Book > which is also wrong. Where do all these incorrect entries come from, Wikipedia does not seem to contain any of these? Since Barack_Obama is a rather prominent entry (so one would assume that errors in Wikipedia or dbpedia should get caught quickly), I am worried about the information that may be in there for less prominent entries. Many thanks, Johann uHi Johann, If you look into the source code of the pages, for instance, the Russian one for Obama, you find out that the list of Obama's book is defined with \"книга\" template (the translation is \"book\") - search for \"{{книга\" in the source code. And - guess what - there is a mapping for the infobox with the same name, to the Book class of the DBpedia ontology The extraction framework just doesn't know which one of the infoboxes on the page is \"the main one\" - there actually seems to be no standard way to specify this in the Wiki page code.
Some heuristics can be used - roughly, finding anything starting with \"{{\" and trying to treat is as a source of the DBpedia type - and this is exactly how you get the strange types. Cheers, Volha uHi Volha, thank you for this explanation! I can now understand why this happened, but I am still hoping that there might be some way of fixing this, because it seems to add considerable noise (isa book and isa person should actually be mutually disjoint and then any ontology containing both facts would be inconsistent). What I also found is that although the downloaded files for the German dbpedia do contain Angela Merkel being a quote, the online version does not show this type for Angela Merkel. So it appears that at least in the online German version, there has been a way to clean this or avoid this? Strictly speaking, all facts from all languages for the same entity (URI) should be consistent, and given the OWA, combinging the facts from multiple languages should improve the KB, so some heuristic way of choosing between the types implied by multiple templates may be a better solution? Is it ever possible for the types inferred from multiple templates to be compatible and correct? My initial assumption would be that templates like \"book\" or \"quote\" in these cases always appear after the main template in the article? What I find particularly worrying about this errors is that they occur for rather important entities, so anyone looking up Angela Merkel in an ontology derived from these files will find that she is-a quote BTW, I had a look at the DBpedia ontology and it seems there are a number of disjointWith axioms in there e.g. person is disjoint with building and with escalator, but book is not defined to be disjoint with anything, it appears. Having such definitions (or simply descriptive object properties which define which types are incompatible without actually causing inconsistency) may help to detect such problems in the first place? Thanks Johann On 12 October 2015 at 16:05, < > wrote: uDear Johann, On Mon, Oct 12, 2015 at 6:57 PM, Johann Petrak < > wrote: The problem with multiple mapped templates is that, depending on the mapped class, each template may creates a unique IRI and the first template takes the main resource and the rest subsequent e.g. Merkel would be in one of those Maybe the German chapter can answer this but they might use a different dataset (older /newer) that does not have this mapping Heuristics are great are we are using them in many places but the problem needs to be fixed at the source. Doing heuristics on errors results in errors again If everyone who identified an error tried to fix the mapping that caused it DBpedia would be a lot better. => we need a more active mapping community and btw, today we announced a mapping sprint for the next release Should be but noone can guarantee that. We already use disjoint axioms for cleaning the data so the more (good quality) disjoint axioms people add the better Best, Dimitris" "Constructing the right SPARQL query" "uHello everyone. I am looking into getting the list of the most influential people in history. To achieve this, I am going to be using this query as shown below. Now, I modify the query to get by adding 'influencedBy' to get a second table. Is there not a decent way to get these two tables queried so we retrieve a uni-dataset. Thank you. SELECT * WHERE { ?p a . ?p ?influenced. } uHi Ali, To combine two (or more) sets of conditions for a query, use the UNION operator. 
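The shape being suggested here looks roughly like the following sketch; it simply reuses the two property names from the question, and the fully worked versions appear a little later in the thread:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?p ?other
WHERE {
  ?p a dbpedia-owl:Person .
  { ?p dbpedia-owl:influenced ?other . }
  UNION
  { ?p dbpedia-owl:influencedBy ?other . }
}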
See for a similar usecase of DBpedia this question on StackOverflow (usually a better place to ask programming questions, by the way): Groeten van Ben On 21 December 2013 18:18, Ali Gajani < > wrote: uThanks for your prompt reply Ben. I have on StackOverFlow but I am interested in some immediate expertise as I am entirely new to SPARQL. I am going to get a directed graph plotted on Gephi, just saying, so I did this, the SPARQL query below. Now, do you think it makes senses because the two tables individually showcased influenced and influencedby and hopefully the UNION operator does not hurt the truth in that knowledge by combining these two. Just take a look at it carefully because even thought the UNION operator worked, I want the truth to be preserved in data. SELECT * WHERE { ?p a . { ?p ?influenced. } UNION { ?p ?influencedBy.} } On Sat, Dec 21, 2013 at 5:57 PM, Ben Companjen < >wrote: uOn Sat, Dec 21, 2013 at 1:04 PM, Ali Gajani < > wrote: With a query like this, I'd expect that both properties would be used to find a match for ?influencer and ?influenced. E.g., select distinct ?influencer ?influencee where { { ?influencer dbpedia-owl:influenced ?influencee } UNION { ?influencee dbpedia-owl:influencedBy ?influencer } } limit 50 Of course, this can be done even more simply with property paths: select distinct ?influencer ?influencee where { ?influencer dbpedia-owl:influenced|^dbpedia-owl:influencedBy ?influencee } limit 50 //JT uFirstly, thank you so much for helping me discover new knowledge (for myself) through this beautifully crafted SPARQL. This makes a lot more sense that you have used the DISTINCT keyword to actually eliminate any duplicates, but will this query ensure the truth is captured in one table in the same way as the individual two tables did, because I want to make sure I can use this dataset to count indegrees (high influencers) properly. It is impossible to survey all the rows to ensure the knowledge is true, but I am asking anyway. On Sat, Dec 21, 2013 at 7:00 PM, Joshua TAYLOR < >wrote: uOn Sat, Dec 21, 2013 at 2:24 PM, Ali Gajani < > wrote: If that's what you're trying to measure, then you could use a query like select ?influencer (count(distinct ?influencee) as ?numberOfInfluencees) where { ?influencer dbpedia-owl:influenced|^dbpedia-owl:influencedBy ?influencee } group by ?influencer order by desc(?numberOfInfluencees) limit 100 to find out how many distinct influencees each influencer had. (Karl Marx is pretty influential.) uThanks, I get this table is pretty cool (and yeah, I thought Aristotle would be first but I was wrong) but I'd like to keep my table in this form: *Influencee : Influencer *so I can actually plot it in Gephi and then do the rest. I think the earlier query you mentioned does that, however, my question was, does it capture the truth as the two earlier individual tables did individually? I just wanted to be sure it does. Moreover, my it would be nice to modify the query to include persons as I did in my Question originally, because at this stage, it gives you stuff like \"Java Programming Language\", which is funny as I'd not like it to be there, specially for my analysis of the 'most powerful men in history'. Thanks Josh. On Sat, Dec 21, 2013 at 7:33 PM, Joshua TAYLOR < >wrote: uOn Sat, Dec 21, 2013 at 2:38 PM, Ali Gajani < > wrote: Yes, you're getting the same results as you would from two separate queries, or from the query using the union (modulo the removal of any duplicate results using `distinct`). 
You can make sure that the influencer and influencee are both people by using the same thing you had before: checking for a ?i rdf:type dbpedia-owl:Person triple: select * where { ?influencer dbpedia-owl:influenced ?influencee . dbpedia-owl:Person ^a ?influencer, ?influencee . } uOn Sat, Dec 21, 2013 at 2:24 PM, Ali Gajani < > wrote: Presumably you mean out-degree if you're talking about influencers. A simple count doesn't sound like it'll capture the real influence. Even if you assume that Wikipedia has a comprehensive and unbiased coverage of influential people (almost certainly not true), shouldn't influencing someone influential count more? That would imply you need to do a page-rank style aggregation of link weights. Tom On Sat, Dec 21, 2013 at 2:24 PM, Ali Gajani < > wrote: I want to make sure I can use this dataset to count indegrees (high influencers) properly. It is impossible to survey all the rows to ensure the knowledge is true, but I am asking anyway. Presumably you mean out-degree if you're talking about influencers.  A simple count doesn't sound like it'll capture the real influence.  Even if you assume that Wikipedia has a comprehensive and unbiased coverage of influential people (almost certainly not true), shouldn't influencing someone influential count more? That would imply you need to do a page-rank style aggregation of link weights. Tom uMany thanks for your input Tom. An in-degree is the number of incoming edges towards that node. I think that captures *influencer: influencee (Aristotle : Alexander, Aristotle : Myself)*, which means, in this scenario, Aristotle (the node), has an in-degree of 2. I thought in-degree was a measure of influence rather than an outdegree. Remember, this is going to be plotted as a *directed* graph in Gephi. I'll be curious to know how I'll actually distinguish in-degrees and out-degrees in Gephi practicaly, but anyway. Moreover, I didn't quite get about how I could do a Page-Rank style style aggregation on this specific scenario. Could you please provide some examples using actual person names so I can digest it well in my head. Thanks for getting my head working though, but I still believe the Wikipedia data gives you a decent impression of influence to an extent, albeit not the most accurate, but it kind of appears to be right in one way or the other. On Sat, Dec 21, 2013 at 7:49 PM, Tom Morris < > wrote: uOn Sat, Dec 21, 2013 at 2:58 PM, Ali Gajani < > wrote: It really depends on whether your relation is influencer -> influenced -> influencee or influenced <- hadInfluencee <- influencee (ie which way the directed edges in the graph run). You can't do PageRank from just the counts. You need the full network of links. As an example, if Marx had the most direct influencees, but Aristotle influenced Marx, shouldn't that count for something? Perhaps more? BTW, Freebase actually thinks Nietzsche is first by simple count, not Marx, but the underlying data is so biased and incomplete for both Wikipedia & Freebase, that I'm not sure it's worth pursuing a more sophisticated weighting. Tom p.s. If you're using Gephi, it has a PageRank implementation u1. Right, it depends on exactly that, the direction of the edges. At the moment, my query returns influencer: influencee, so that clearly means if the edges run towards the first column in the table, which is influencer. Am I right? 2. I know we can't do PR from counts. I didn't mean that at all. 
You are right, that factor does mean a lot (Aristotle influenced Marx), and there can be even more complicated analysis done, but at this stage, I am just getting the feel for the data. Thanks to you, new ideas like PR have popped up which I aim to utilize through Gephi. Moreover, how since you mentioned Freebase, I'd like to ask you if you know how can I use the influence node to do the same query that I am doing here on DBPedia. I gave up because I couldn't get it to work and moreover it returns JSON so I thought I will have problems doing a UNION of DBPedia with Freebase results. If you can give me some insight into this, I'd be able to perform my analysis with more data. Duplication won't be an issue due to the SQL query handling that, it will still give me some more data which is good. Thanks Tom. On Sat, Dec 21, 2013 at 8:09 PM, Tom Morris < > wrote:" "Missing wikiPageWikiLink's at sparql endpoint" "uHello, I was wondering why the extracted wiki links ( included in the sparql endpoint. Would be interesting information. Best, Dominic uHi Dominic, You can use Sindice Sparql endpoint [1] to get wiki links. Cheers, Nitish [1] On 29 Jul 2012, at 01:39, Dominic Looser wrote:" "Moving DBpedia mappings wiki to RDF - feedback needed" "uDear all, we plan to move away from the mappings wiki in the near future and use RDF for storing the mappings. We need some feedback from the community, especially from mapping & ontology editors, on which syntax to use. Please read the doc below and cast your vote & comments at the end edit" "Fake Conference Fake: WORLDCOMP and Hamid Arabnia" "uFake Conference Fake: WORLDCOMP and Hamid Arabnia Have you heard of any international conference that has no chair (or organizer or committee members) for it and the conference is going to take place in the next 2-3weeks? If you didn’t know about it then you must visit the website or or contains the details of world’s biggest fake conference in computer science, WORLDCOMP, organized by Hamid Arabnia from University of Georgia who is hiding behind the scenes (like a theif) but collecting the registration fee for the conference quietly. WORLDCOMP has many credentials and you can get those details by accessing the above websites or by searching Google using the keywords: worldcomp fake Shame Hamid Arabnia!" "Get the number of users who have edited a particular article" "uHi people, I have to choose important Wiki articles from a bunch of them. So, I was thinking of doing that by the number of users who have edited that page. Is it possible to get that information using DBpedia ???? uHi Somesh, editor information is not available in DBpedia but you could use page links (inlinks) as an indicator as this is the usual approach to rank pages. Cheers, Anja On Jun 21, 2012, at 6:48, Somesh Jain < > wrote: uOn 6/21/2012 12:48 AM, Somesh Jain wrote: Data dumps with history are available from Wikipedia so it would be possible to derive this if you have a lot of time to download and parse the files. As others have said, a simple count of page links into a page is a very good importance measure. Edit count or # of unique editing users is probably more a metric of controversy than quality. Anyway, there are zillions of cool things you could mine from the history. uHi Somesh, sorry for the late reply. On 06/21/2012 06:48 AM, Somesh Jain wrote: In DBpedia-Live, we use a contributor extractor which extracts information about the contributor of the page, i.e. the editor of the page. This information could be helpful in your case. 
So if you want to get the number of people contributed to the article of Paris as an example, you can use the following query: SELECT count(?o) WHERE { ?o }" "Bls: Problem in creating Indonesian Chapter" "uHi Dimitris, Thank you for reply, Yes, I think there was a proxy configuration behind  And if I want to use domain name id.dbpedia.org, what should I do?   Regards, Riko Dari: Dimitris Kontokostas < > Kepada: Riko Adi Prasetya < > Dikirim: Kamis, 16 Mei 2013 20:08 Judul: Re: [Dbpedia-discussion] Problem in creating Indonesian Chapter Hi Riko, This should be the ip/hostname of your server. Do you have any special proxy configurations behind Cheers, Dimitris On Thu, May 16, 2013 at 11:33 AM, Riko Adi Prasetya < > wrote: Hi All, uHi Dimitris, Problem solved, the problem is caused the host header from the request is not forwarded. Thank you   Regards, Riko Dari: Dimitris Kontokostas < > Kepada: Riko Adi Prasetya < > Cc: \" \" < > Dikirim: Jumat, 17 Mei 2013 14:05 Judul: Re: [Dbpedia-discussion] Problem in creating Indonesian Chapter Hi Riko, I am not an expert on this but if you set this up with apache ProxyPass, ProxyPassReverse should also be set. If it works correctly with the current domain, it will also work with the official one. Best, Dimitris On Thu, May 16, 2013 at 5:58 PM, Riko Adi Prasetya < > wrote: Hi Dimitris, uHi Riko, This seems to be a bug in the vad plugin.Can you add it in our issue tracker [1]? Cheers, Dimitris [1] On Sun, May 19, 2013 at 6:52 PM, Riko Adi Prasetya < >wrote: uHi Riko, The Dutch & Greek chapters use the latest vad code and it is not displayed correctly there. Submit it as a bug and I will try to fix it asap. Best, Dimitris On Tue, May 21, 2013 at 5:45 PM, Riko Adi Prasetya < >wrote:" "DBPedia" "uHi all, I am a student at the University of York, UK and I am just starting with DBPedia so i wanted to be directed at first to make a good start. My goal is to get data from DBPedia and visualize them using webGL in Javascript. What is the best way to be able to ask queries and get results in a format that can be easily read? Is there a framework that communicates with DBPedia? Thanks in advance, Dimitris uHi Dimitris, On 06/05/2012 05:49 PM, wrote: you can use the jena framework [1] to pose queries to DBpedia in sparql. You can find a good tutorial for it \"here Good luck with your work. [1] index.html uOn 05.06.12 17:49, wrote: You should try the examples here: Also for Javascript I can highly recomment rdfstore-js: cu Adrian" "GSoC List-Extractor" "uDear community, This is a link to GSoC progress and description of my project, i.e. the List Extractor . You may also want to take a look at the repo containing project code, extracted datasets and \"how to run\" instructions. TL;DR: I have successfully extracted list information from Wikipedia pages of actors and writers (both for english and italian), and provided quality evaluations. The final result is a Python program focused on modularity and expandability, since hopefully everyone can add new domains and languages to extend its potential. *Student: Federica BaiocchiMentors: Marco Fossati (main), Claudia Diamantini, Domenico Potena, Emanuele Storti* Best regards, Federica. Dear community, This is a link to GSoC progress and description of my project, i.e. the List Extractor . You may also want to take a look at the repo  containing project code, extracted datasets and 'how to run' instructions. 
TL;DR: I have successfully extracted list information from Wikipedia pages of actors and writers (both for English and Italian), and provided quality evaluations. The final result is a Python program focused on modularity and expandability, since hopefully everyone can add new domains and languages to extend its potential. Student: Federica Baiocchi Mentors: Marco Fossati (main), Claudia Diamantini, Domenico Potena, Emanuele Storti Best regards, Federica." "License DBpedia as CC-BY-SA? was: license update at the Wikimedia Foundation" "uHoi, I am really pleased that you are willing to follow the WMF in this but as I have argued on this list in the past, DBpedia is a collection of facts; in essence they can individually not be copyrighted. In the past I have had the WMF say that they are happy for extractions like the DBpedia ones under any license. It would be beneficial if DBpedia allowed for a more liberal license; I am sure that it can and may. In this reply I have added Mike Godwin, the lawyer of the WMF, and I hope that he is willing to shed some light on this. Thanks, GerardM 2009/5/21 Chris Bizer < > Hi Gerard and all, as it is the goal of the DBpedia project to make extracted Wikipedia data as freely and easily usable as possible, I think we are happy to follow the Wikipedia license change and will make the next DBpedia release available under CC-BY-SA license.
Are there any objections against this policy by anybody? I think that we would even be willing to go a step further and release DBpedia under a even more liberal license, but we don’t know if we are allowed to do this. We are not lawyers and there seem to be tricky points about databases being creative works and differences between European and US law on this. We also don’t know if extracting the first 250 words of each Wikipedia article and putting them into a database is creative enough to justify to have this database under a more liberal license then Wikipedia. If somebody with a solid legal background knows the answers to these questions, we would be more than happy if he could give us some advice. Cheers, Chris Von: Gerard Meijssen [mailto: ] Gesendet: Donnerstag, 21. Mai 2009 09:09 An: dbpedia-discussion Betreff: [Dbpedia-discussion] license update at the Wikimedia Foundation Hoi, As expected the Wikimedia Foundation will change its licensing. In the past it was said that DBpedia would follow suit. It is important that it does when you assume that DBpedia needs a license in the first place. As there is little time left in which this change can be made, I urge you all to follow the WMF. Thanks, GerardM ' Result uOn Thu, May 21, 2009 at 9:48 AM, Gerard Meijssen < > wrote: Yes, people keep repeating thatyet EU law, at least in NL, disagreesJust pointing to this USA law is not useful outside the USA. That would actually be interesting. If they make such s public statement, that would clarify the intention and address much of the legal confusion. Can you please point me to this public statement of the WMF that states that data extractions (like info from ChemBoxes) is free to anyone to be used in any way they want to (like license under CC0)? Egon uHoi, Wikipedia does not allow for original research. This means that a Wikipedia is a repackaging of what is already known. Individual facts if they can be copyrighted in the first place, are therefore not for the WMF to license. As to the licensing of data mining Wikipedia for research and, this is what we are talking about when we consider DBpedia, I had asked it a long time ago on behalf of a group of linguists that used Wikipedia as a corpus. This was all done by mail. I am sure that Mike may be willing to say something about this. Thanks, GerardM 2009/5/21 Egon Willighagen < > uHi Mike, Do you thing we are allowed to do this? Thanks a lot for your help. Chris uChris Bizer wrote: I think database copyright law is hairy. I think it's safe in all countries for Dbpedia to have the same license as wikipedia (which will be CC-BY-SA) In the US, where I am, databases aren't copyrightable. My understanding is the law is different in the EU: I know a lot of dbpedia people live in the EU, so they are certainly affected. My understanding is that Freebase uses Wikipedia data and WEX; they've certainly extracted a list of topics from wikipedia. They're based in the US so they can probably claim they're under the US jurisidiction. I've got no idea what that means for international uses of Freebase. Although I personally benefit from the lassez-faire situation in the US, I'm wondering if the stronger db copyright laws in the EU are more reflective of current realities. If I were going to, say, compile a list of US presidents and their birth dates, I could certainly use a manual process to copy these out of the Encyclopedia Britannica. You can't say a fact like dbpedia:Abraham_Lincoln someschema:bornOn \"February 12, 1809\" . belongs to anybody. 
On the other hand, now that automated processes can extract facts from documents, it seems like the \"sweat of the brow\" argument is getting weaker. uHi all, Catching up on a question from 21 May. On Thu, 21 May 2009, Chris Bizer wrote: The Feist case in the USA shows that you can copy a telephone book without getting into problem with the copyright. The individual fact can probably not copyrightable if they are sufficiently short. I think that 250 works would be long enough to have copyright on its own: The phrase \"E.T. Phone Home\" was copyrightable in the USA - if I remember correctly. The copyright law for Wikipedia and DBpedia as a collection is different between USA and countries of the European Union, due to the EU Database Directive of 1996. A collection of fact and data may be protected in the EU, regardless of whether the individual data items are copyrightable or not. That means that you cannot copy a 'substantial part' of a database if the database constitutes a 'substantial investment'. It is possible (in my opinion quite) that the set of Wikipedia templates constitute a database, so that the authors (collectively?) has database right. So perhaps European Wikipedia authors can claim database right against the European(?) DBpedia. But the situation is very hazy due to the international nature of Wikipedia/DBpedia and its authors and re-users. Wikipedia is in USA so perhaps the European Directive does not reach there even thought authors and database copier are European I guess the DBpedia cannot claim (European database) copyright, because DBpedia does not 'construct' the data: This is what the Wikipedia authors do. In a Danish court case (Ofir v. Home) the company Home lost its database right to the Ofir search engine because Home did not construct the data itself - that was done by Home's franchise takers. Somewhat similarly was the 'Hill case' in the European Court. My guess is that the 'safest' for DBpedia would be to honor Wikipedia authors database rights and just re-distribute the data under the same license that Wikipedia uses. I am not a lawyer. I have just recently looked into the issue due to my own database-like wiki. And I must say I find the situation very hazy. Maybe constructive examples of the use of database right, such as a copylefted DBpedia could help pave the way for a global clarification. /Finn Finn Aarup Nielsen, DTU Informatics, Denmark Lundbeck Foundation Center for Integrated Molecular Brain Imaging" "DBpedia - how to contribute" "uHi, we are three students of the Masters Degree in Teaching of Informatics, from the University of Minho, Braga, Portugal. We are interested in contributing for your project DBPedia, as we must do a project in a class of the mentioned Masters Degree. So our question is, how can we contribute to DBPedia project? Please send us as much information as possible, as this is the first time that we are envolved in this kind of project. Yours Trully, Paulo Torres, César Araújo, Raquel Santos Hi, we are three students of the Masters Degree in Teaching of Informatics, from the University of Minho, Braga, Portugal. We are interested in contributing for your project DBPedia, as we must do a project in a class of the mentioned Masters Degree. So our question is, how can we contribute to DBPedia project? Please send us as much information as possible, as this is the first time that we are envolved in this kind of project. 
Yours Trully, Paulo Torres, César Araújo, Raquel Santos uWhat do you want to contribute , code or documentation or both? On Thursday, March 20, 2014, César Araújo < > wrote: uHi guys and welcome to our community! Why don't you have a look at the open issues [1] if you want to contribute to the codebase? Cheers! [1] On 3/20/14, 12:03 PM, Ali Gajani wrote: uAlso, check this ideas page [1] for brand new stuff. Note however that some of them will be implemented as part of the Google Summer of Code 2014 (we are currently in the selection phase). Cheers! [1] On 3/24/14, 10:44 AM, Marco Fossati wrote:" "Download the current version of DBPedia Datasets, in JSON / XML" "uHi, For a school project, i've to use semantic data. So i would like to use DBPedia. But, the available datasets on this page ( don't manage to use this dataCan i download all dbpedia datasets in json ? Like in this page, for example : Or in rdf ? xml ? I would like to have one file by entity. I tried to do it myself, by using your source : But i don'k understand how it works. How can i generate json (or xml / rdf) files ? Thx, Julien Hi, For a school project, i've to use semantic data. So i would like to use DBPedia. But, the available datasets on this page ( Julien uHi Julien, On Thu, Jan 13, 2011 at 16:38, Julien < > wrote: Well, this data was extracted using the most recent English Wikipedia dump ( Wikipedia dumps of around the same time. The files are bzip2 compressed N-Triples, a serialization of RDF. From that there are straightforward ways to transform the data into JSON or XML. Hope this helps. Cheers, Max uHi, Thx for this reply On Thu, Jan 13, 2011 at 5:26 PM, Max Jakob < > wrote: Ok. Yes, but there is one file for Ontology, one file for Titles, another for abstractsetcHow can i merge all this files in order to have ALL informations about one wikipedia-page in the same file ? In this page : all data about this entity are in this page. How can i do this ? uyou can't ;) unless you set up your own server but you can download all the json's you want and store them on you computer cheers, Jim Yes, but there is one file for Ontology, one file for Titles, another for abstractsetcHow can i merge all this files in order to have ALL informations about one wikipedia-page in the same file ?  you can't ;) unless you set up your own server but you can download all the json's you want and store them on you computer cheers, Jim uHi, one wikipedia-page in the same file ? Are you using allmighty unix? Then you would have all the tools at hand, in order to do a hash based split based on the subjects of the triples. with \"bzcat *\" you could merge ALL files (in a directory) into a single stream. (If you just want everything in a single huge file, youre done now) Then you pipe the stream into a little script something like: #split each line into subject and remainder |while read subject rest; do h=compute hash based on subject # use the hash as a filename, and append our current line to that file echo \"$subject $rest\" >> $h done | This would cause all triples with same subject to end up in the same file. Afterwards you could use \"sort -u filename\" to sort the lines in a file, which would result in all triples with the same subject to be in consecutive rows. (the -u would remove potential duplicate rows) Kind regards, Claus On 01/13/2011 05:40 PM, Julien wrote: uHi, >> How can i merge all this files in order to have ALL informations about one wikipedia-page in the same file ? Are you using allmighty linux? 
Then you would have all the tools at hand, in order to do a hash based split based on the subjects of the triples. with \"bzcat *\" you could merge ALL files (in a directory) into a single stream. (If you just want everything in a single huge file, youre done now) Then you pipe the stream into a little script something like: #split each line into subject and remainder |while read subject rest; do h=compute hash based on subject # use the hash as a filename, and append our current line to that file echo \"$subject $rest\" >> $h done | This would cause all triples with same subject to end up in the same file. Afterwards you could use \"sort -u filename\" to sort the lines in a file, which would result in all triples with the same subject to be in consecutive rows. (the -u would remove potential duplicate rows) Kind regards, Claus On 01/13/2011 07:11 PM, Dimitris Kontokostas wrote: uHi, On Thu, Jan 13, 2011 at 7:11 PM, Dimitris Kontokostas < > wrote: Just by dowloading all files one by one ? Rhaaa. I would like to have just on archive. How i can do it ? I don't find any (recent) documentation :( Thx ;-) Julien. uHello, On Thu, Jan 13, 2011 at 7:34 PM, Claus Stadler < > wrote: It is interesting, but that seems complicated. I explain : - In file \"titles\", you have : \"AccessibleComputing\"@en . - In file \"Ontology\", you have : cycad cycadophytes - In file \"Images\", you have : . - It's too différentall files have a différent formatHow do you do on dbpedia.com ? A still-ready-program exists ? Where can i look for on SVN ? uit's not as complicated as you thing;) as Max and Claus said, these files are just a simpler (but equivalent) rdf representation you may be missing some background knowledge > For a school project, i've to use semantic data. So i would like to use what exactly do you want to do and there might be a simple solution you may also take a look at the following links or google about linked data, sparql, ontologies etc it's not as complicated as you thing;) as Max and Claus said, these files are just a simpler (but equivalent) rdf representation you may be missing some background knowledge > For a school project, i've to use semantic data. So i would like to use > DBPedia. what exactly do you want to do and there might be a simple solution you may also take a look at the following links or google about linked data, sparql, ontologies etc guides-and-tutorials" "Pagelinks dataset" "uHi, I'm Dario Garcia-Gasulla, an AI researcher at Barcelona Tech (UPC). I'm currently doing research on very large directed graphs and I am using one of your datasets for testing. Concretly, I am using the \"Wikipedia Pagelinks\" dataset as available in the DBpedia web site. Unfortunately the description of the dataset is not very detailed: Wikipedia Pagelinks /Dataset containing internal links between DBpedia instances. The dataset was created from the internal links between Wikipedia articles. The dataset might be useful for structural analysis, data mining or for ranking DBpedia instances using Page Rank or similar algorithms./ I wonder if you could give me more information on how the dataset was built and what composes it. I understand Wikipedia has 4M articles and 31M pages, while this dataset has 17M instances and 130M links (couldn't find the number of links of Wikipedia). What's the relation between both? Could someone briefly explain the nature of the Pagelinks dataset and the differences with the Wikipedia? Thank you for your time, Dario. 
Hi, I'm Dario Garcia-Gasulla, an AI researcher at Barcelona Tech (UPC). I'm currently doing research on very large directed graphs and I am using one of your datasets for testing. Concretly, I am using the \"Wikipedia Pagelinks\" dataset as available in the DBpedia web site. Unfortunately the description of the dataset is not very detailed: Wikipedia Pagelinks Dataset containing internal links between DBpedia instances. The dataset was created from the internal links between Wikipedia articles. The dataset might be useful for structural analysis, data mining or for ranking DBpedia instances using Page Rank or similar algorithms. I wonder if you could give me more information on how the dataset was built and what composes it. I understand Wikipedia has 4M articles and 31M pages, while this dataset has 17M instances and 130M links (couldn't find the number of links of Wikipedia). What's the relation between both? Could someone briefly explain the nature of the Pagelinks dataset and the differences with the Wikipedia? Thank you for your time, Dario. uHi Dario, the dataset you are using is extracted by the org.dbpedia.extraction.mappings.PageLinksExtractor [1]. This extractor collects internal wiki links [2] from Wikipedia content articles (that is, wikipedia pages which belong to the Main namespace [3]) to other wikipedia pages (please note I am not talking about content articles here, because also links to pages in the File or Category namespaces are collected). Each row - triple - in the Pagelinks represent a directed link between two pages, e.g. . means that an internal link to found in You can check this link exists here (first sentence) [6] Basically this can be modeled in a directed graph as an edge \"Albedo -> Latin\" The reason why you have 17M instances (I suppose you are counting the nodes in your graph) is because objects in each triple can be outside the Main namespace. As far as I remember, 4M articles are wiki pages with belong to the Main namespace and which are neither redirects [4] nor disambiguation pages [5]. Hope this clarifies a bit :-) Cheers Andrea [1] [2] [3] [4] [5] [6] 2013/12/2 Dario Garcia Gasulla < > uIn addition to Adrea's reply, we also collect the \"red links\" which means links to pages that do not exist (yet). On Tue, Dec 3, 2013 at 11:32 AM, Andrea Di Menna < > wrote: uSomething I found out recently is that the page links don't capture links that are generated by macros, in particular almost all of the links to pages like don't show up because they are generated by the {cite} macro. These can be easily extracted from the Wikipedia HTML of course, which is what I did to pull off this project On Tue, Dec 3, 2013 at 4:32 AM, Andrea Di Menna < > wrote: uOn Tue, Dec 3, 2013 at 1:44 PM, Paul Houle < > wrote: That's good to know, but couldn't you get this directly from the Wikimedia API without resorting to HTML parsing by asking for template calls to Tom On Tue, Dec 3, 2013 at 1:44 PM, Paul Houle < > wrote: Something I found out recently is that the page links don't capture links that are generated by macros,  in particular almost all of the links to pages like Tom uI guess Paul wanted to know which book is cited by one wikipedia page (e.g. page A cites book x). If I am not wrong by asking template transclusions you only get the first part of the triple (page A). Paul, your use case is interesting. At the moment we are not dealing with the {{cite}} template nor {{cite book}} etc. We are looking into extensions which could support similar use cases anyway. 
Also please note that at the moment the framework does not handle references either (i.e. what is inside ) when using the SimpleWikiParser [1] What do you exactly mean when you talk about \"Wikipedia HTML\"? Do you refer to HTML dumps of the whole wikipedia? Cheers Andrea [1] 2013/12/3 Tom Morris < > uI think I could get this data out of some API, but there are great HTML 5 parsing libraries now, so a link extractor from HTML can be built as quickly than an API client. There are two big advantages of looking at links in HTML: (i) you can use the same software to analyze multiple sites, and (ii) the HTML output is often the most tested output of a system. This is particularly a problem in the case of Wikipedia markup which has no formal specification and for which the editors aren't concerned if the markup is clean but they will fix problems if they cause the HTML to look wrong. Another advantage of HTML is that you can work from a static dump file, or run a web crawler against the real Wikipedia or against a local copy of Wikipedia loaded from the database dump files. On Tue, Dec 3, 2013 at 2:30 PM, Andrea Di Menna < > wrote: u2013/12/4 Paul Houle < > Where can you get such dump from? Seems not practical Pretty slow, isn't it? Cheers! Andrea u@Andrea, there are old static dumps available, but I can say that running the web crawler is not at all difficult. I got a list of topics by looking at the ?s for DBpedia descriptions and then wrote a very simple single-threaded crawler that took a few days to run on a micro instance in AWS. The main key to writing a successful web crawler is keeping it simple. On Dec 5, 2013 4:23 AM, \"Andrea Di Menna\" < > wrote: wrote: (e.g. first anyway. references SimpleWikiParser [1] references. refer >> > wrote: Wikimedia u@Paul, unfortunately HTML wikipedia dumps are not released anymore (they are old static dumps as you said). This is a problem for a project like DBpedia, as you can easily understand. Moreover, I did not mean that it is not possible to crawl Wikipedia instances or load dump into a private Mediawiki instance (the latter is what happens when abstracts are extracted), I am just saying that this is probably not practical for a project like DBpedia which extracts data from multiple wikipedias. Cheers Andrea 2013/12/5 Paul Houle < > uThe \"DBpedia Way\" of extracting the citations probably would be to build something that treats the citations the way infoboxes are treated. It's one way of doing things, and it has it's own integrity, but it's not the way I do things. (DBpedia does it this way about as well as it can be done, why try to beat it?) A few years back I wrote a very elaborate Wikipedia markup parser in .NET, it used a recursive descent parser and lots and lots of heuristics to deal with special cases. The purpose of it was to accurately parse author and licensing metadata from Wikimedia Commons when ingesting images into Ookaboo. I had to do the special cases that because Wikipedia markup doesn't have a formal spec. I quickly ran into a diminishing returns situation where I had to work harder and harder to improve recall and get deteriorating results. I later wrote a very simple parser for Flickr which just parsed the HTML and took advantage of the \"cool URIs\" published in Flickr. Today I think of it as pretending that the Linked Data revolution has already arrived, because really if you look at the link graph of Flickr, there is a subset of it which isn't very different from the link graph of Ookaboo. 
Anyway, I needed to pull some stuff out of Wikimedia Commons and it took me 20 minutes to modify the Flickr parser to work for Commons and get at least 80% of the recall that the old parser got. On Thu, Dec 5, 2013 at 10:29 AM, Andrea Di Menna < > wrote:" "Next DBpedia release" "uHi, When is the next DBpedia release anticipated? At the moment I'm using 3.6 but it's nearing 6 months now Thanks Az uHi Azhar, Max Jakob will start implementing various improvements to the extraction code 1st of June and will also look into merging the improvements from the internationalization branch into the main DBpedia codebase. After finishing with this, he will generate and publish a new release. Likely sometime in July. In parallel, there is ongoing work on DBpedia Live in Leipzig and it is possible that DBpedia Live might go online in the next weeks. Sören, Jens, Sebastian: Is there already a fixed timescale for this? Cheers, Chris Von: Azhar Jassal [mailto: ] Gesendet: Freitag, 6. Mai 2011 02:02 An: Betreff: [Dbpedia-discussion] Next DBpedia release Hi, When is the next DBpedia release anticipated? At the moment I'm using 3.6 but it's nearing 6 months now Thanks Az" "SNORQL code/installation" "uHi, We are using a Sesame 2 server installation in our team and would like to browse our RDF data with the SNORQL interface. Can someone point me to the right place for getting the code or some information on how to do this? Thank you very much. Kind regards," "AdministrativeRegions appear when querying for Countries." "uHi all, I am trying to learn more about the Semantic Web, in particular SPARQL, thus I set out on a simple task; to list all the countries in the World that are currently classed as such and then get a cumulative population. I should get around 200 countries and about 7 billion people in the end, I assumed. Some are mislabeled in Wikipedia with Country Infoboxes, which is fine (I have corrected a couple of these but obviously not synced), but others are marked up with Infobox Former Subdivision which according to my calculation should map to the AdministrativeRegion class, so does anyone know why they are appearing in this query? PREFIX dbo: PREFIX dbp: select distinct ?country where { ?country a dbo:Country . OPTIONAL { ?country dbo:dissolutionYear ?x } . FILTER (!bound(?x)) } ORDER BY ?country In particular: * * * Many thanks, Alex"
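Editorial note, not part of the thread: Alex's stated end goal was a cumulative population, so here is a minimal sketch of running a query of this shape programmatically and summing the result. The endpoint URL, the SPARQLWrapper library and the choice of dbo:populationTotal are assumptions, and resources missing that property are simply skipped.

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://dbpedia.org/sparql")  # assumed public endpoint
endpoint.setQuery("""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?country ?pop WHERE {
  ?country a dbo:Country ;
           dbo:populationTotal ?pop .
  FILTER NOT EXISTS { ?country dbo:dissolutionYear ?y }
}
""")
endpoint.setReturnFormat(JSON)
rows = endpoint.query().convert()["results"]["bindings"]
# sum the reported populations; values come back as literals, so convert defensively
total = sum(int(float(r["pop"]["value"])) for r in rows)
print(len(rows), "countries, cumulative population:", total)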
"Best method for local DBpedia live mirror" "uHi, I'd like to ask what is the preferred method, also looking into future support, for maintaining a local DBpedia live mirror. I believe the two options are: 1. Setting up a local DBpedia mirror and downloading the periodic Changesets to update the DBpedia store. 2. Setting up a local DBpedia mirror and using the DBpedia Information Extraction Framework to interface with a local Wikipedia mirror as described here: Thanks for your help! Emery" "videos on dbpedia" "uThere are 75,368 videos on wikimedia ( I want to find as many videos associated with dbpedia entries as possible. What is the best way to search videos on dbpedia? The following query on dbpedia core returns only 75 videos: SELECT (count(*) as ?num) WHERE { ?dbpediaId dbp:filename ?filename FILTER (regex(str(?filename), '.*\\.(ogv|WebM)$')) } uRunning SELECT count(*) WHERE {?s a dbpedia-owl:MovingImage.} on gives you ~50K videos Associating these with dbpedia.org resource is something we do not support right now we plan to do this for images with the help of HPI but nothing planned for videos atm @magnus any ideas for this? On Thu, Jun 23, 2016 at 7:18 PM, Joakim Soderberg < > wrote: uYes, we are currently extending the image extractor to collect all image references in articles linking to commons. This will resemble a dataset of links from dbpedia resources to resources in commons.dbpedia.org. Videos and Audio are embedded in various ways in wikipedia articles, e.g. Template {{Audio | en-us-Alabama.ogg | /ˌæləˈbæmə/}} Template {{Listen | filename = Accordion chords-01.ogg | title = Accordion chords | description = Chords being played on an accordion}} Link to the File:-Namespace [[File:Videoonwikipedia.ogv | thumb | thumbtime=1:58 | right | 320px | Tutorial video for 'Video on Wikipedia' (''how very meta!'')]] I’ll check if we can embed such as well." "Linked Data Meetup London - Program and Sign-Up" "uDear all, the Linked Data Meetup in London on September 9th now has a program and a signup page: This meetup is for people who are interested in learning about Linked Data on the Web, people working on making data available on the web, people building infrastructure and applications for the Web of Data, and people who are interested in how to apply Linked Data technologies within their enterprises. The response to our invitation to bring the Linked Data community together for an event in London has been amazing so far: 70 people signed up already, and we just started to reach out. In addition to presentation and lightning talk sessions with a great lineup of speakers, we will also host two panels with special guests: \"Linked Government Data\" and \"the Future of Journalism\". For the detailed program, please visit the meetup page above.
See you in London, Georgi & Silver" "ESWC 2014 Students Participation Support" "uapologies for cross-posting ESWC 2014 Students Participation Support 11th Extended Semantic Web Conference (ESWC) 2014 Dates: May 25 - 29, 2014 Venue: Anissaras, Crete, Greece Hashtag: #eswc2014 Feed: @eswc_conf Site: General Chair: Valentina Presutti (STLab, ISTC-CNR, IT) We are pleased to announce that ESWC will distribute a total budget of 4,000 Eur among full-time students for supporting their participation in ESWC 2014. All full-time students are eligible to participate in the selection. The number of awards will be limited, and the amount of each award will depend on the number of participants and will be provided in the form of a reduced registration fee. In order to apply, please be sure to execute all the following actions: - register at [1] by choosing the Student Participation Fee by April 20th (skip the payment step) - send an email to by April 20th acknowledging your registration and providing your details (name and affiliation) - have your advisor email by April 30th a recommendation to in which he/she states support for your application, confirms your status as a full-time student and indicates your possible authorship or co-authorship of a paper accepted in any of the ESWC 2014 tracks To receive full consideration, both registration and advisor letters must be received by the indicated deadlines. If you have already registered and paid you are still eligible to participate in the selection. You will have to perform the last two steps, and in case you are selected, you will be reimbursed the amount of the award. Notifications will be sent by May 9th, 2014. [1] registration" "MyStrands Links Music Recommendations To Wikipedia Info" "uLots of interesting stuff showing up today. This article about using Wikipedia in your applications is also good: Cheers Chris" "Download server down and wikipedia" "uThanks for the reply. The server has not been responding for days, maybe five or more. It's practically impossible to download any datasets. Does someone know if it is possible in another way, or more importantly, when the server will be back online? Cecilia is my first email here, so that's a hello to everyone. Ciao Cecilia, welcome and nice to read you! :-) datasets from more than three days but server seems There are some problems at the internet connection hosting the DBpedia dumps ( AFAIK, there are no other mirrors available. following. Really, I'm trying to download again some datasets that seem first one middle of a server or if Probably it's just a temporary problem with the server. Cheers, roberto uHi, Try this one: Best, Vladimir 2011/6/15 < >:" "Problems with the YAGO hierarchy" "uI realize this is probably not the best place for this, but I couldn't quickly find a place to discuss YAGO. On the surface, the YAGO type assertions look quite useful in dbpedia, like: dbr:Albert_Brooks a yago:Actor109765278 . dbr:The_Sting a yago:MotionPictureFilm103789400 . but what is e.g. yago:MotionPictureFilm103789400? Well, yago:MotionPictureFilm103789400 rdfs:subClassOf yago:Film103338821 . OK so far, but up one more level leads to: yago:Film103338821 rdfs:subClassOf yago:PhotographicPaper103926412 Oops. That's not, then, the sense I want for the type of dbr:The_Sting.
My limited understanding of YAGO is that it is mostly based on WordNet - is this just a problem created by mapping to the wrong synset for \"MotionPicture\"? Hopefully, yes, but I couldn't find the \"right\" one. There are a few other examples. -Chris PS above: PREFIX dbr: PREFIX yago: uHi Chris, Yago:MotionPictureFilm103789400 is the only handpicked type assertion in DBpedia-Yago, and it seems we actually got it wrong. The official Yago distribution doesn't contain films (because of some license-questions I think), so I added MotionPictureFilm103789400 manually. All other type assertion are based on Yago's algorithm. I think we should change that to Film106613686 (see [1]) which seems to me to be right. What do you think? Could you provide some examples from other domains than films please, would be interesting for later investigation. Cheers, Georgi PS: The Sting is an excellent movie! [1] WordNet.jsp?synset=106613686 uThat looks correct. I thought I had other examples but they turn out to be problems with wordnet hypernymy choices, not with YAGO. -Chris Georgi Kobilarov wrote: uNot sure how to do a thorough test, but found another anomoly: Musicals are mapped to the only wordnet synset for musical (107019172), which is a hyponym of play in the sense of an event (107018931), whereas plays are mapped to the wordent synset for play in the sense of a composition (107007945). Thus there is no common superclass of plays and musicals This is partly a wordnet problem, for some reason there is no composition sense for musical, just the performance sense. I'd agree that the play mapping is the right one for plays, and I see the musical mapping is the best you could do, but I think I'd prefer musicals to be mapped to the proper sense of play than to the wrong (but only available) sense for musicals. The corrected mapping you suggested below for movie also has the problem that it has no common superclass with play. Again that's a wordnet problem, I don't know why there isn't a sense (that I could find) for movie of a dramatic composition. Just an oversight I guess. -Chris Georgi Kobilarov wrote: uHi guys, First-off a confession; it was me who chose MotionPictureFilm103789400! Having said that, I'm not convinced it's wrong :) If you go far enough up from Film106613686 (the other proposal) you reach SocialEvent107288639, which seems equally incorrect to me, if not more so. For example, the *creative work* (which is what we're talking about, right?) known as \"Dodgeball\" that has a representation on the DVD which sits on my shelf might not be a subClass of photographic paper, but I'd argue it's even less a subClass of social event. (For the record, Film106262567 is another contender worth considering, but also not quite right). I'm afraid I don't really have a solution, other than to say lets not rush into a change that might also be suboptimal. All I can offer are the observations that: (from tedious experience) selecting these manually is really hard; doing so automatically is also pretty error prone ;) Maybe the best thing to bear in mind is which properties of \"films\" do we want to be able to model in DBpedia? I guess this should guide our choice of class. Cheers, Tom. On 16/10/2007, Georgi Kobilarov < > wrote: uThis message does not appear to have made it to the discussion list. I'll try again. Chris Welty wrote: uDid you get my other message concerning this ? Neither your message nor mine seems to have made it to the discussion list. 
-Chris Tom Heath wrote: uThe usual treatment of creative works is with something analagous to a semiotic triangle that links the creative work, author, and the physical manifestation. So i agree, neither of these choices is correct for movie. The problem appears to be that Wordnet (at least this version) doesn't have a synset for the creative work in a movie - there should be a synset that is more or less a sibling of novel, poem, etc., but I couldn't find one. As I said in a previous post (that doesn't appear to have made it to the list), there is a synset (107007945) for \"play\" that captures the creative work, but not one for musical. Apparently the wordnet team is not interested in entertainment. So personally I'd be happiest mapping movie (and musical) to that synset for play (107007945), as it seems to be the closest thing in wordnet to the sense of movie (and musical) that the wikipedia pages represent. I invited Alessandro Oltramari in as he knows a lot about treating WN as an ontology -Chris Tom Heath wrote:" "Idiots guide to dbpedia Part II" "uI got better results when I made resource name = case-sensitive. So I got a whole lot of links to other Wikipedia resources. I must figure out how to order/organise this in some ontology. I can build an ontology manually, using Dolce possibly as a starting point (top-down approach) but being a lazy bastard I thought I could get dbpedia to do it for me :) (bottom-up approach). I think Wikipedia covers most of the concepts I am interested in. The challenge is applying the Dolce pattern to stuff I pull from Wikipedia via dbpedia. Any references, tips, advice, mild abuse would be appreciated. Hi I want to build a domain ontology for wildfires. I understand dbpedia extracts linkages from Wikipedia. I tried a basic sparql query (I do not know sparql) to extract all references to fire and got something about a community fire unit. Wikipedia provides much more than that so something is wrong. Can somebody give me a 20s tutorial how to extract RDF information on wild fires from dbpedia or am I being completely naïve here? I am a Geologist by training so be kind :) Cheers" "DBpedia Live-Updates (changesets)" "uHi all, I'm working on my master thesis and my work concerns to understand the syncronization process between Wikipedia and DBpedia live-uptades (changesets). In the following I describe some of the problems I came across and I would like to have an answer: First, according to the changes made in Wikipedia and the ones reported in DBpedia, I cannot identify a corrispondence one to one. In other words, I found that there are a lot of added and removed triples for a resource in DBpedia than the changes of the same resource shown in Wikipedia history page. How does it come? I was expecting that a change in change in the Wikipedia infobox of an article is mapped in DBpedia as an added/removed triple for the same article/resource. Second, based on the structure of live-updates of DBpedia there is an incompatibility between a folder and its correspondent zip folder, e.g., if we consider zip folder 2012-09-01.tar.gz and the folder 2012-09-01, we find that there are triples that are present in the former folder and missing in the latter one. Is it caused because the system is down sometimes? In case of a positive answer, which folder should we take in consideration for our analysis? Last point but not less relevant regards to the last modified field associated with the added/removed file. 
I want to understand if the last modified value corresponds either to the effective time of the change carried out in a DBpedia resource or to the uploading time of the added/removed file in the changeset? Best regards, Andrea Giacomini u+1 on this request. The DBpedia live updates are a very powerful, unique and extremely useful functionality but I have also noticed some similar inconsistencies. Cheers, Kavi On Wed, Oct 10, 2012 at 11:43 AM, Andrea Giacomini < >wrote: uHi Andrea, and Kavi, first of all thank you all for your feedback. On 10/10/2012 01:25 PM, AboutThisDay wrote: You are right in that, and we will fix that issue and get back to you again. Actually you should consider the folder, as the zip file just compresses the folder so that anyone can download all the updates at once. So if you open \"2012-09-01.tar.gz\", you will find files called \"2012-09-01-00.tar.gz\", \"2012-09-01-01.tar.gz\", and so on; those files contain the same contents as their corresponding folder. Anyway you don't have to worry about that as our sync-tool, which is available at care of all these details. The file called \"lastPublishedFile.txt\" is the one the sync-tool uses to know if there are more files available, and if not it keeps waiting till more files are available." "Keynote: Larry Masinter (Adobe) at Sepublica" "uGood news for all those who are wondering about Adobe and the semantic web. Larry Masinter ( will be our keynote speaker with \"Getting More Data Through the Publication Pipeline”. Larry will also participate in our round table, so if you want to discuss the semantic web and document formats used for publication, this is the right moment Don't forget to submit your paper to Sepublica," "Korean labels" "uHi all, I was just wondering why there is a Korean labels file available for downloading here [1] but it is not queryable here [2]. For example if I enter the query: select * { ?o } no korean label is returned.
(There is a korean label for this resource in the labels dump file.) Also, why is DBpedia use \"ko\" for korean instead of \"kr\"? Cheers, Daniel [1] [2] sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query;=select+*+%7B+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FSouth_Korea%3E+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E+%3Fo+%7D&format;=text%2Fhtml&timeout;=0&debug;=on uOn 19 January 2012 02:42, Gerber Daniel < > wrote: 'ko' is a language code (ISO 639-1), 'kr' is a TLD. Some others are Japanese (ja)/Japan (jp), Chinese (zh)/China (cn), Ukrainian (uk)/Ukraine (ua), Irish (ga)/Ireland (ie). On the other side, 'ca' is the 639-1 for Catalan, and TLD for Canada." "DBpedia Relationship Finder Release 2" "uHello, I'd like to announce the second version of the DBpedia Relationship Finder, which can be found at: It is a small application developed by Jörg Schüppel and me, which explores the DBpedia [1] infobox dataset. It can answer questions like \"How are Leipzig and the Semantic Web related?\" [2]. The Relationship Finder provides an easy to use interface to explore the huge amount of DBpedia data. The new version uses better algorithms, offers the possibility to ignore objects and properties, and includes numerous smaller improvements. Best regards, Jens Lehmann PS: Apologies in advance if the server responses are somewhat slow. [1] [2] index.php?firstObject=Leipzig&secondObject;=Semantic_Web&limit;=1&maxdistance;=10" "DBpedia and my own wiki" "uI have recently started my own wiki, Brede Wiki. It is rather small at the moment but hopefully will grow. It is based on MediaWiki and contains results from neuroimaging studies structured with templates, such as 'Paper' and 'Brain region'. These I keep clean of nesting and formating. So far I dump the wiki content to an bzipped XML file and from that dump extract the templates and convert them to an SQL file and further on to SQLite3 file that I use in a small specialized search engine. I plan to extend the search engine capabilities, but I am now wondering to which extent it is possible to use tools associated with DBpedia to query my wiki. The wiki is available from and the dump and sqlite-SQL file available from: The temporary search engine is here: Sincerely Finn Finn Aarup Nielsen, DTU Informatics, Denmark Lundbeck Foundation Center for Integrated Molecular Brain Imaging uHi Finn, If you freshly start with a new MediaWiki and do not expect GB of data and Millions of visitors immediately I rather recommend installing the Semantic MediaWiki (SMW) extension [1]. DBpedia is very focused on the specifics of Wikipedia and will be too difficult to apply in your case. SMW on the other hand is pretty easy to install and use and gives you the opportunity to semantically structure content in your Wiki and to embed queries and their results into your pages. Hope that helps, Sören [1]" "cleaning up countries; SPARQL weirdness" "uI made this task: and started analysis here: Related to this I posted this bug: But I hit some weirdness on the endpoint that I cannot explain: select * {?country a dbo:Country} returns However, no triples?!?! Can someone from OpenLink explain this? uAs Dimitris wrote in DESCRIBE it works fine describe Guess: It might be related to the apostrophe in the URL uOn 3/22/16 8:49 AM, Vladimir Alexiev wrote: Yes, and we are looking into a fix that covers the green Linked Data pages." 
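Editorial aside on the apostrophe guess in the thread just above, not from the thread itself: inside a SPARQL IRI written in angle brackets an apostrophe is legal as-is, and the usual failure mode is a raw apostrophe leaking into a hand-built request URL. A minimal sketch, with the resource chosen purely for illustration:

import requests

# example resource picked for the apostrophe; adjust as needed
iri = "http://dbpedia.org/resource/People's_Republic_of_China"

# Inside a SPARQL IRI (<...>) the apostrophe needs no escaping.
query = f"DESCRIBE <{iri}>"

# requests percent-encodes the query parameter, so the apostrophe never
# appears raw in the request URL sent to the endpoint.
resp = requests.get(
    "http://dbpedia.org/sparql",
    params={"query": query, "format": "text/turtle"},
    timeout=60,
)
resp.raise_for_status()
print(resp.text[:1000])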
"Errors while adding labels and comments to OntologyClass" "uHi all, since I have been granted some hours ago the right to make changes to the ontology and the mapping I have started with the (seemingly) simple task of adding some labels and comments. So I did this: 3AAbbey&action;=historysubmit&diff;=25096&oldid;=24867 Being bold (Wikipedia style), I saved anyway but I was getting the following: *Cannot use equivalent property 'rdfs:label' *Cannot use equivalent property 'rdfs:label' *Cannot use equivalent property 'foaf:img' *Cannot use equivalent property 'rdfs:label' *equivalent class 'dct:Location' of class 'Place' not found *title=Flag;ns=200/OntologyClass/OntologyClass;language:wiki=mappings,locale=en - Ignoring invalid node '} ' in value of property 'labels'. which look quite scary. Undoing the addition (i.e. going back to the previous version) and validating leads to the same messages. So, should just keep ignoring those messages or it is something which can be fixed. Thank you. Cristian uHi Cristian, that's a problem in our validation. The errors occur in other classes, not in the one you are editing, but we check the whole ontology and show all errors. We should improve the validation and show better error messages. I don't know when we will have time. You can have look at Validate.scala [1], ExtractionManager.scala [2] and XMLLogHandler.scala [3] and try to fix this. The problem is basically that ExtractionManager.validateOntologyPages has to validate the whole ontology at once (because classes depend on each other), and XMLLogHandler collects all error messages. XMLLogHandler should collect only the error messages that are relevant for the current page. I'm not sure if that's really possible - some modifications in the current classes, e.g. changes to the type hierarchy, may cause errors in other classes - so another strategy could be that the log entry should at least show the name of the class where the error occurs. It would be cool if you or someone else could try to fix this. Here's a little introduction: Or you can add an issue report here: Cheers, JC [1] [2] [3] On 23 April 2013 00:12, Cristian Consonni < > wrote:" "Question about DBpedia Years and some suggestions for dealing with Taxa" "uHi, I am looking for useful URI's for years so that I can say the following kinds of things. I was looking at the DBpedia entries for this and found these and What I don't see is any markup that says that one year is *before* or *after * another like this 1758 AD testing I did not know if something like this would make sense or if I am missing something? Also if you look at following dbpedia-owl:binomialAuthority The object of *dbpedia-owl:binomialAuthority* should be a *foaf:Person* or list of *foaf:Person*(s) perhaps we need something else to markup the year that the taxon was described. Like *dbpedia-owl:yearDescribed, dbpedia-owl:binomialYearYou could also make these people a DBpedia subclass of *foaf:Person* like * dbpedia-owl:TaxonAuthor* (perhaps a subclass of dbpedia-owl:Scientist) The authority string like (Linnaeus, 1771) is sometimes referred to as the \"authority\", if this is in parenthesis it indicates that the original genus has been changed For example this species was originally named *Felis concolor* and had the authority string Linnaeus, 1771. e.g *Felis concolor *Linnaeus, 1771 What I would like to be able to do is have a visualization of the lifespans of different people, that also show what years they described species. 
This would be helpful in disambiguating taxonomic authors and finding errors in the taxonomic literature. I did not know if it would be best to link to the DBpedia years using the same DBpedia predicates or create custom URI's like above on my own, which interlink with the DBpedia URI's I have also noticed that people often put trinomial names in the binomial field on Wikipedia and I was wondering if that is something that could be caught and fixed during the DBpedia processing? The binomial should consist of the genus and specific epithet e.g. \"Puma concolor\", the trinomial should be the genus, specific epithet and subspecific epithet like \"Puma concolor couguar\". Also I was thinking that it might be good to capture the extinct symbol \"†\" used like †Smilodon and mark up the taxon with an \"extinct\" URI, something like dbpedia-owl:status dbpedia_category:extinct And remove the \"†\" character from dbpedia_prop:genus \"Smilodon\" You can also infer that if the family for this taxon is extinct then any subfamilies, genera, or species are also extinct. One last thing is that you can safely replace the \"&\" in the authority string with \"and\" in the RDF so (d'Orbigny & Gervais, 1844) => (d'Orbigny and Gervais, 1844). Respectfully, - Pete" "infobox mappings" "uHi, I have a problem with infobox Writer for Mappings (hr) For example, I have in the Croatian edition of Wikipedia the following wiki markup for the birth and death properties: rođenje = [[6. lipnja]] [[1854]]. [[Oplaznik]], [[Hrvatska]]| smrt = [[10. prosinca]] [[1889]]. [[Glina (grad)|Glina]]/[[Zagreb]], [[Hrvatska]]| I have written mappings for the previously mentioned properties as follows: {{PropertyMapping | templateProperty = rođenje | ontologyProperty = birthDate }} {{PropertyMapping | templateProperty = rođenje| ontologyProperty = birthYear }} {{PropertyMapping | templateProperty = rođenje | ontologyProperty = birthPlace }} {{PropertyMapping | templateProperty = smrt | ontologyProperty = deathDate }} {{PropertyMapping | templateProperty = smrt | ontologyProperty = deathYear }} {{PropertyMapping | templateProperty = smrt | ontologyProperty = deathPlace }} I want to preserve all values (birthDate, birthYear and birthPlace), but when I test these mappings I get the following results for birth: As you can see from the results, instead of using birthDate I was offered birthPlace in the second row. The same thing happens with the death property (deathPlace instead of deathDate). I would appreciate your advice. Regards, Ivana
uHi Ivana, I think it is hard to extract the birthPlace relation for this infobox and not introduce errors. This has two reasons: 1. The data for both the date and the place of birth go into the same template property. In the English Infobox_writer, the template properties birth_date and birth_place are split and it is therefore easier to extract both things with two PropertyMappings. If the properties could also be separate in the Croatian infobox, the same accurate mapping as for English could be taken. 2. The dates can have links. The extraction code currently cannot distinguish if a link is set on a place or on a date. It is assuming that all links are places (because of the mapping rođenje -> birthPlace). That is why it is extracting both the date and the place with the birthPlace relation. If we could always assume that the dates do not have links but only the places have links, the extraction would work as expected. Sidenote: the birthDate relation is not extracted because there is no support for Croatian date parsing in the framework at the moment. You could provide us with a list of months in Croatian and we could build this in. (we will be working on a way to configure these types of things more easily in the future.) Best, Max 2010/10/28 Ivana Sarić < >: uHi Max, thank you very much for the explanation. I'm glad that I can contribute with a list of months in Croatian: 1. siječanj = January, 2. veljača = February, 3. ožujak = March, 4. travanj = April, 5. svibanj = May, 6. lipanj = June, 7. srpanj = July, 8. kolovoz = August, 9. rujan = September, 10. listopad = October, 11. studeni = November, 12. prosinac = December Regards, Ivana 2010/10/29 Max Jakob < >" "Extraction Framework Use." "uHi! I followed instructions from: topic: \"Running a local mirror of the web service\". I thought that I needed to do that to get candidate URIs from Dbpedia, but when I run that code: I can get candidate URIs without having to do the command inside the Extraction Framework directory: ./run Server dbpedia-lookup-index-3.8 I discovered I don't need the Extraction Framework to get candidate URIs from the Dbpedia with Lookup. So what do I use the Extraction Framework for, and in what case do I need to rebuild the index as we can see in the instructions: Index\"? Thank you! uOn Mon, Jul 22, 2013 at 11:28 PM, Luciane Monteiro < >wrote: local one. the already build index" "DBpedia Lookup" "uHi all, DBpedia provides Linked Data URIs for 2.6 million things. However, it wasn't always easy in the past for other Linked Data publishers to find a DBpedia URI. DBpedia Lookup [1] aims to fill that gap. It provides a service to find the most-likely DBpedia URIs for a given keyword.
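Editorial illustration, not part of the announcement: a minimal sketch of calling a KeywordSearch-style lookup endpoint over HTTP. The URL below is the commonly cited deployment and, like the JSON field names, should be treated as an assumption here.

import requests

# assumed Lookup deployment; the announcement only says a web service exists
LOOKUP = "http://lookup.dbpedia.org/api/search/KeywordSearch"

resp = requests.get(
    LOOKUP,
    params={"QueryString": "Shakespeare", "MaxHits": 5},
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
# field names guessed from typical responses; read defensively
for hit in resp.json().get("results", []):
    print(hit.get("uri"), "-", hit.get("label"))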
The underlying algorithm ranks DBpedia resources based on their relevance in Wikipedia and includes synonyms into the index. Try the terms \"Shakespeare\", \"EU\", or \"Cambridge\" and see for yourself if the results you'd expect show up at the top. The result ranking is different - and supposed to be more useful - than a simple full-text search or SPARQL-Query with embedded regular expression for matching labels. There is a web-service available at [2]. You can use the KeywordSearch method for searching full terms (as you see at [1]), and the PrefixSearch method for an autocompletion-style interface such as the one you see at [3]. The webservice returns a list of resource URIs with English abstracts, dbpedia classes and categories. Feel free to use the service as you like. If you plan to use it in a production system or to run a high-load batch process, please drop me a message to let me know. Thanks. I hope that DBpedia Lookup is useful for you, and I'd appreciate any feedback. Many thanks to the semantic web folks at the BBC for their support and feedback on the development of DBpedia Lookup. Cheers, Georgi [1] [2] [3] autocomplete.aspx uOn Tue, Feb 10, 2009 at 6:08 PM, Georgi Kobilarov < > wrote: Or moleculesnice! Egon uProblem: How do I find the URI of an Entity in DBpedia's data space? You've already seen the solutions from Sindice and Georgi's DBpedia Lookup service. Here is our solution, which is basically revisiting the faceted \"Search\" and \"Find\" demo we put out a few weeks ago. As per my earlier mail, we have a hot staged DBpedia instance running on Virtuoso 6.x (our next major release), and this particular release will ship with: 1. Server hosted faceted search, find, and browse 2. Web Service Interface to the browser (*see link on the browser page*). Back to the problem, let's attempt to locate the URI on an Entity associated with the pattern: Telemann. I've chosen \"Telemann\" because Hugh Glaser makes a vital point using this example in relation to how we should start to expose Linked Data utility to newcomers. Steps: 1. Go to: 2. Change focus dimension (facet) to: Type via the link in the Navigation section (it's natural to use \"Type\" filtering to disambiguate when we attempt to locate things e.g., in the real world) 3. Now you have a list or Entities by Type, you can determine what specifically you are looking for 4. I assume a planet, so I click on the \"dbpedia-owl:Planet\" link 5. Now I have one dimension of interest, I flip over to the properties of the filtered entity set by clicking \"properties containing text\" from the Navigation section 6. Then I pick foaf:name as the property that should lead me to what I want (which is the answer to my search) 7. Click on the link and get both the URI and a description Note: This functionality is available as a Web Service [1], we desperately hope XML and JSON level developers can use the Web Service interface to basically match and exceed Parallax without losing the essence of the Linked Data Web (ie. keep: URIs in plain sight of user agents via @href when producing Web information resources from linked data queries nee. Reports ). Links: 1. draft of the Web Service API for the faceted browser service 2. combining \"Search\" and \"Find\" via a server hosted faceted search and browse service. uHi Matthias, It works with HTTP GET as well, see the web-service descriptions at [1] and [2] (scroll down a bit). 
[1] [2] Cheers, Georgi" "querying via sparql & ranking" "uHello, I'm fairly new to dbpedia, so please forgive me if this is a stupid question: I'm trying to query dbpedia via sparql for things such as \"people who have lived in this timeframe\", and I'd like to rank the results. Are there any metrics I could use for that ? For example, is there a way to query how many other (wikipedia) pages are referring to a given article ? Any other metric I could use as a substitute for importance ? Many thanks, Stefan uHi Stefan, On 10/05/2012 08:11 PM, Stefan Seefeld wrote: the dataset titled \"Wikipedia Pagelinks\" is the one containing the information you are looking for, but this dataset is not loaded to the official endpoint. So, I would suggest you to establish your own endpoint, i.e. you can download all DBpedia dumps and load them into one of your machines and direct your queries to it. uHi Mohamed, On 10/06/2012 12:06 AM, Mohamed Morsey wrote: OK, I will try that. Is the process for setting up a local copy / endpoint documented somewhere (i.e. starting from what to download, and how to establish a local sparql endpoint for it) ? My ultimate goal is to build a tool that anyone can use to display historical data, similar to the HyperHistory Online ( dynamic, and based on community (wikipedia) data. For that, I'd really like to connect to a publicly available endpoint such as Many thanks, Stefan uHi Stefan, On 10/06/2012 03:22 PM, Stefan Seefeld wrote: this blog entry can be useful Hope it helps. uOops! Accidentally hit reply instead of reply-to-all in my previous reply. Others - please see a couple of short conversations below for continuity. ? AboutThisDay.com is a privately funded project run by a couple of developers and started pretty much from our bedrooms. The first phase of this project is a proof-of-concept piece for which we launched the beta version only a couple of months ago. We are trying to wrap up a few bits and pieces before we go live and yes – we need to prioritise our ‘About Us’ page J Check out our DBPedia announcement thread - more background. All our current data is based on the DBpedia dataset - the static DBPedia dumps as well as the Live Updates - again to the DBpedia community) With regards to your original question about ranking the results, it was the exact same problem we faced earlier in our project. As you can see in www.AboutThisDay.com, for a given day you can potentially get thousands of results. Using the category and year range filters, these results can be filtered by a great deal but we still needed a way to bubble up the most popular results to the top for quality and better user experience. For this ranking challenge, we have come up with our own home-grown popularity algorithm and here is your answer: After looking at various proprietary ranking sources like Alexa and SEOmoz, we are currently evaluating the Wikipedia page count statistics data available at algorithm. Hope this helps. PS: As always, we very warmly welcome your feedback reg. www.AboutThisDay.com. Please also show your support by liking and tweeting us in social media. Cheers, Kavi On Sat, Oct 6, 2012 at 9:03 PM, Stefan Seefeld < > wrote: uOn 10/06/2012 05:42 PM, AboutThisDay wrote: Good idea, thanks. Good. What about the front-end logic itself ? Is that Free Software, too ? (I'm asking because my intent is to develop a tool that everyone can further refine. It may be used from a website, but it may also be embedded in other applications (or be distributed as a smartphone app, say). 
I hear you. :-) Right, that is exactly what I'd like to do, too. Hence my earlier question whether that data is accessible from the existing dbpedia.org endpoint. As I'm not interested in hosting my own dbpedia clone / website, but rather in providing a graphical frontend to the existing data, I'd like to see this added to dbpedia.org/sparql. For prototyping purposes I can certainly live with a local clone, though. It certainly does, thanks ! I will, once I learn more about the project itself. Best regards, Stefan uI am afraid not. Its just the data that is open atm. can further refine. Makes sense to me. Will see what others say Cheers, Kavi On Sat, Oct 6, 2012 at 11:18 PM, Stefan Seefeld < > wrote: uLet me share a bit of what I know about the page counts because I’ve been evaluating these as a subjective importance score too. It’s about 2 TB of data and I’ve been working with a slow connection so I need to work with samples of this data, not the whole thing. I tried sampling a week worth of data and the results were a disaster. In the first week of August, Michael Phelps was the most important person in the world. Maybe that was true. But it’s not a good answer for a score that’s valid for all time. It’s clear that the “prior distribution of concepts” that people look up in Wikipedia is highly time dependent and that’s probably also true for other prior distributions. So I grabbed about 50GB of data randomly sampled and got results that are better but still have a strong recency bias; if you look at some movie, like “The Avengers”, you see that interest in the movie picks up when people read the hype, is high when the movie is in theaters, and then falls off to some long term trend. With 5 years of data we can see the peak of the Avengers, but we don’t see the peak of the 1977 Star Wars movie, so there’s an unfair bias towards the Avengers and against Star Wars. When you look at a bunch of things associated with the same time (“writers who were active in 1920”) then the recency bias will be less obnoxious. It probably still hurts results for in the same way variable document lengths caused so much trouble in TREC until Okapi got invented. Anyhow I am processing this data on a Hadoop cluster in my house, as I get more data in my sample I can ask more specific questions and get more accurate answers. The data could also be moved into the AWS cloud and processed with Elastic/Map Reduce In my case this is just a matter of pointing Pig at a different server. I wrote to the people who run the public data sets in AWS about the possibility of getting the page counts hosted in AWS but I haven’t heard back from them yet. If more people ask perhaps they’ll take some action on this. I’ve done some other work that uses link-based importance scores and pretty obviously there are different things you could try there but it was hard to make a real project out of it because I didn’t have an evaluation set. If you try to predict the popularity-based scores based on links based score that would at least give some basis for saying one kind of link-based score is better than another. I’d do it but I have a lot of other projects that are more urgent if less interesting. Anyway if anybody writes back I could make a turtle file with time-averaged popularity-based importance scores :Larry_King :importance 0.00003287 . this would be about 4 million triples. It would be great if I could get these hosted at the DBpedia site or otherwise I could host it on my site. 
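Editorial sketch prompted by Paul's description above, not code from the thread: aggregating a local sample of hourly pagecount files into one per-title total and writing it out as Turtle, in the spirit of the ":Larry_King :importance ..." idea. The file layout (project, title, request count, bytes), the file names, and the placeholder predicate are assumptions.

import glob
import gzip
from collections import Counter
from urllib.parse import quote, unquote

totals = Counter()
for path in glob.glob("pagecounts-*.gz"):                 # assumed local sample of hourly files
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split(" ")
            # assumed layout: project, title, request count, bytes
            if len(fields) < 3 or fields[0] != "en":      # English Wikipedia only
                continue
            title, views = fields[1], fields[2]
            if views.isdigit():
                totals[unquote(title)] += int(views)

grand_total = sum(totals.values()) or 1
with open("importance.ttl", "w", encoding="utf-8") as out:
    out.write("@prefix ex: <http://example.org/ns#> .\n")   # placeholder predicate namespace
    for title, views in totals.most_common():
        subject = "<http://dbpedia.org/resource/%s>" % quote(title, safe="_()',")
        out.write("%s ex:importance %.8f .\n" % (subject, views / grand_total))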
uOn 10/10/2012 05:31 PM, wrote: Great, thanks ! OK, I think I don't quite understand what \"page counts\" really measure, as I hadn't expected this metric to take up so much extra space, nor that it was so volatile. My impression was that this was a count of links from other wikipedia pages (or dbpedia entities) to the entity in question, which gives a rather non-subjective measure of importance. (Well, \"importance\" may still be a questionable interpretation, but still, not quite as bad as a popularity contest.) Am I misunderstanding what page counts measure? Why do they take up so much memory? I'd think this could be compressed into (roughly) one number per entity. Or are all the link origins tracked, too ?
(That could be useful, for example if the links are to be weighted according to the search criteria. But that definitely make things rather complex) Thanks, Stefan uThose counts to which you refer are \"incoming page links\" and the ones that Paul mentions are \"page views\". The latter are aggregated by time period so there are many numbers for one page even if you aggregate it into one number per week (using Paul's example). I think Paul meant that it is difficult to find a good period to sample so that you'd have one number per page that reflects whatever notion of importance you may have in mind. On Oct 11, 2012 1:01 AM, \"Stefan Seefeld\" < > wrote: uOn 10/11/2012 02:37 AM, Pablo N. Mendes wrote: Oh, now I see ! So \"page count\" is quite an ambiguous term, then, as it could stand for links as well as views, which clearly are vastly different things. Indeed, views are very volatile, and for that reason alone probably not a good metric, at least for the domain of data I'm interested in. Thanks, Stefan uFor a few years I've been using counts of inlinks from the \"Wikipedia Pagelinks\" as a subjective importance measure. I'm now transitioning to something based on pageview statistics (page counts). I say \"subjective\" because this is one of those things where there isn't necessarily a right answer, something can be better or worse, but you can't specify everything" "dbpintegrator: add / delete order" "uHi Mohamed, all, Besides the unexpected counter reset (see other mail), I have another question. In the dbpintegrator code, the added triples and deleted triples are downloaded and submitted to Virtuoso, in that order. However, doesn't it make more sense to first delete the triples and then add the new ones? As the code is now, newly added triples can be deleted immediately Thanks, Karel uHi Karel, On 09/21/2011 12:02 PM, karel braeckman wrote: You are right, I'll fix that issue and thanks for pointing it out." "Table Extractor project" "uDear DBpedia Community, My project, theTable Extractor, has been chosen for the next GSoC. Let me introduce myself: I am Simone Papalini, a master’s student of Computer Science from Università Politecnica delle Marche, Italy. I discovered the world of linked data (and DBpedia itself) only recently, but I have been fascinated from then. So, I am really enthusiastic about this project. Here you have a little abstract describing the idea behind Table Extractor: “Wikipedia is full of data hidden in tables. The aim of this project is to explore the possibilities of exploiting all the data represented with the appearance of tables in Wiki pages, in order to populate the different versions of DBpedia through new data of interest. The Table Extractor has to be the engine of this data “revolution”: it would achieve the final purpose of extracting the semi structured data from all those tables now scattered in most of the Wiki pages.” I am already doing some sort of statistics trying to understand which data scope would be suitable to start with. Any kind of advice is really appreciated! Some useful links to follow my work: The Table Extractor [proposal] Git Repo Progress log Regards, Simone Papalini Dear DBpedia Community, My project, theTable Extractor, has been chosen for the next GSoC. Let me introduce myself: I am Simone Papalini, a master’s student of Computer Science from Università Politecnica delle Marche, Italy. I discovered the world of linked data (and DBpedia itself) only recently, but I have been fascinated from then. 
So, I am really enthusiastic about this project. Here you have a little abstract describing the idea behind Table Extractor: “Wikipedia is full of data hidden in tables. The aim of this project is to explore the possibilities of exploiting all the data represented with the appearance of tables in Wiki pages, in order to populate the different versions of DBpedia through new data of interest. The Table Extractor has to be the engine of this data “revolution”: it would achieve the final purpose of extracting the semi structured data from all those tables now scattered in most of the Wiki pages.” I am already doing some sort of statistics trying to understand which data scope would be suitable to start with. Any kind of advice is really appreciated! Some useful links to follow my work: The Table Extractor [proposal] Git Repo Progress log Regards, Simone Papalini uThanks Simone for sharing and welcome again to the community! Cheers, On 5/10/16 11:07, Simone Papalini wrote:" "Issues about the DBpedia as Tables data" "uDear Bo, Thank you for your interest! ->However, we find some interesting issues with the table data. We focus mainly on the Place ontology and its data. For example, in the Place csv file, there is no instance uri \" You are right, -> For some of the subclasses (for example All of the tables that are not available for download are actually empty, i.e. there are no instances of those classes in DBpedia. You can confirm that for the class AmusementParkAttraction by running the following query against the DBpedia SPARQL endpoint: select count (distinct ?s) where {?s a } -> We also found that the class hierarchy listed by dbpedia ontology ( The representation is different because the Regards, Petar From: Bo Yan [mailto: ] Sent: Thursday, May 22, 2014 8:27 AM To: ; Cc: Yingjie Hu; 高松 Subject: Issues about the DBpedia as Tables data Dear Mr. Ristoski, We are the STKO research group from the Department of Geography at the University of California Santa Barbara. Recently, we are doing some research using the dbpedia data and we found that you provided dbpedia data as tables. These table data are very good and make our data processing a lot easier. However, we find some interesting issues with the table data. We focus mainly on the Place ontology and its data. For example, in the Place csv file, there is no instance uri \" For some of the subclasses (for example We also found that the class hierarchy listed by dbpedia ontology ( We would appreciate your effort in helping us with these issues. Thank you very much for your time. Best, Bo Yan Bo Yan MA/PhD Student Space and Time Knowledge Organization Lab (STKO) Dept. of Geography University of California, Santa Barbara" "Dutch Language mapping?" "uI am trying to set up dbpedia-spotlight for the Dutch language. There are some datasets available for Dutch (nl), but I expect to at least need the dbpedia \"disambiguation\" dataset, which is not available for download.
After setting up extraction_framework for \"nl\" and doing : editing extraction.properties: removing all extractors except: extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor) $ cd dump;mvn scala:run I get an error message: INFO: Mappings loaded (nl) java.lang.reflect.InvocationTargetException Caused by: java.util.NoSuchElementException: key not found: nl at scala.collection.MapLike$class.default(MapLike.scala:225) at scala.collection.immutable.HashMap.default(HashMap.scala:38) at scala.collection.MapLike$class.apply(MapLike.scala:135) at scala.collection.immutable.HashMap.apply(HashMap.scala:38) at org.dbpedia.extraction.mappings.DisambiguationExtractor. (DisambiguationExtractor.scala:22) I assume this means that some classes have not been implemented for \"nl\"? If so, I would like to know if such an effort is on the way or whether it would be feasible for me to give it a try? Is there some pointer/documentation on how to get started? Thanks, Lourens DETAILS OF WHAT I DID I managed to install the extraction_framework. The documentation seems a bit out of date though so it could be I did things wrong. I managed to download the \"nl\" wikipedia input by editing dump.properties and doing $ cd dump; mvn scala:run -Dlauncher=download Extraction started when commenting out all other \"extraction.=\" entries in dump/extraction.properties leaving only extractors.nl =MappingExtractor and running $ mvn scala:run The output indicates that extraction proceeds nicely. But I expect that in the result the \"disambiguation\" result will be missing. When I replace extractors.nl (analogous to other languages): extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor I get error messages mentioned above.
I managed to download the 'nl' wikipedia input by editing dump.properties and doing $ cd dump; mvn scala:run -Dlauncher=download Extraction started when commenting out all other 'extraction.=' entries in dump/extraction.properties leaving only extractors.nl =MappingExtractor and running $ mvn scala:run The output indicates that extraction proceeds nicely. But I expect that in the result the 'disambiguation' result will be missing. When I replace extractors.nl (analoguous to other languages): extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor I get error messages mentioned above. uCan you show us your extraction.properties? I suspect you forgot the line below? languages=nl I am also not sure if you should have fully qualified (org.dbpedia.extraction.mappings.DisambiguationExtractor) or just the class name (DisambiguationExtractor). Cheers, Pablo On Fri, Jun 15, 2012 at 1:43 PM, Meij, L.K. van der < >wrote: uYou should also setup the following file with the 'nl' key configuration core/src/main/scala/org/dbpedia/extraction/config/mappings/DisambiguationExtractorConfig.scala you must define the language-specific token used by Wikipedia in the article title (e.g. \" (disambiguation)\" for the English Wikipedia You could also contribute this back to the framework so as to be included in the next release (patch or mail) Cheers, Dimitris On Fri, Jun 15, 2012 at 5:41 PM, Pablo Mendes < > wrote: uThanks for your suggestions. I do not know whether attachments are allowed, so I paste the extraction.properties below. (apparantly I already tried adding the full path:dbpedia.extraction.mappings.DisambiguationExtractor, but without, the same error message follows). extractors.nl =MappingExtractor,DisambiguationExtractor and extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor give the same error. If I only have extractors.nl =MappingExtractor the extraction process takes about 30 minutes and seems to end without problems (I haven't looked at generated outputfiles yet though). (I started without commenting out the other extractors, but then despite having languages=nl, other languages were being textracted) Kind regards, Lourens DETAILS The error I get, again: error Caused by: java.util.NoSuchElementException: key not found: nl at org.dbpedia.extraction.mappings.DisambiguationExtractor. (DisambiguationExtractor.scala:22) extraction.propertiesdir=/home/lourens/spotlight/wikipedia source=pages-articles.xml.bz2 require-download-complete=true languages=nl extractors=ArticleCategoriesExtractor,CategoryLabelExtractor,ExternalLinksExtractor,\ GeoExtractor,InfoboxExtractor,LabelExtractor,PageIdExtractor,PageLinksExtractor,\ RedirectExtractor,RevisionIdExtractor,SkosCategoriesExtractor,WikiPageExtractor extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor #extractors.nl=MappingExtractor ontology=/ontology.xml mappings=/mappings uri-policy.uri=uri:en; generic:en; xml-safe-predicates:* uri-policy.iri=generic:en; xml-safe-predicates:* format.nt.gz=n-triples;uri-policy.uri format.nq.gz=n-quads;uri-policy.uri format.ttl.gz=turtle-triples;uri-policy.iri format.tql.gz=turtle-quads;uri-policy.iri On Jun 15, 2012, at 16:41 PM, Pablo Mendes wrote: Can you show us your extraction.properties? I suspect you forgot the line below? 
languages=nl I am also not sure if you should have fully qualified (org.dbpedia.extraction.mappings.DisambiguationExtractor) or just the class name (DisambiguationExtractor). Cheers, Pablo On Fri, Jun 15, 2012 at 1:43 PM, Meij, L.K. van der < > wrote: I am trying to set up dbpedia-spotlight for the Dutch language. There are some datasets available for Dutch (nl), but I expect to at least need the dbpedia \"disambiguation\" dataset, which is not available for download. After setting up extraction_framework for \"nl\" and doing : editing extraction.properties: reoving all extractors.except: extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor) $ cd dump;mvn scala:run I get an error message: INFO: Mappings loaded (nl) java.lang.reflect.InvocationTargetException Caused by: java.util.NoSuchElementException: key not found: nl at scala.collection.MapLike$class.default(MapLike.scala:225) at scala.collection.immutable.HashMap.default(HashMap.scala:38) at scala.collection.MapLike$class.apply(MapLike.scala:135) at scala.collection.immutable.HashMap.apply(HashMap.scala:38) at org.dbpedia.extraction.mappings.DisambiguationExtractor. (DisambiguationExtractor.scala:22) I assume this means that some classes have not been implemented for \"nl\"? If so, I would like to know if such an effort is on the way or whether it would be feasible for me to give it a try? Is there some pointer/documentation on how to get started? Thanks, Lourens DETAILS OF WHAT I DID I managed to install the extraction_framework. The documentation seems a bit out of date though so it could be I did things wrong. I managed to download the \"nl\" wikipedia input by editing dump.properties and doing $ cd dump; mvn scala:run -Dlauncher=download Extraction started when commenting out all other \"extraction.=\" entries in dump/extraction.properties leaving only extractors.nl =MappingExtractor and running $ mvn scala:run The output indicates that extraction proceeds nicely. But I expect that in the result the \"disambiguation\" result will be missing. When I replace extractors.nl (analoguous to other languages): extractors.nl =MappingExtractor,org.dbpedia.extraction.mappings.DisambiguationExtractor,HomepageExtractor,ImageExtractor,\ InterLanguageLinksExtractor I get error messages mentioned above. uHi Lourens, welcome to DBpedia hacking! :-) As others already said, simple class names in extraction.properties are prefixed by \"org.dbpedia.extraction.mappings.\" [1] (the package where all our extractors currently live), so org.dbpedia.extraction.mappings.DisambiguationExtractor and DisambiguationExtractor are equivalent. Finding disambiguation pages is not hard: just look for certain template invocations, for example {{Disambig}}. That's what we do. The problem is that our list of disambiguation templates [3] is outdated. We should get the info from pages like some code that almost does that but didn't have time yet to finish the job [4]. I just updated Disambiguation.scala. To get the DisambiguationExtractor running you'll have to add a line to DisambiguationExtractorConfig.scala [2] - the part of the title that many disambig pages on nl.wp contain. Similar as in the other languages. If there are any other questions, let us know! Cheers, JC [1] [2] [3] [4] On Fri, Jun 15, 2012 at 5:15 PM, Meij, L.K. van der < > wrote:" "Airpedia (was Slovak DBPedia mappings)" "uI too would be interested in more info on Airpedia. 
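Following the DisambiguationExtractorConfig.scala pointer in the Dutch-mapping thread above, here is a minimal, self-contained sketch of the kind of per-language map entry being described. The object and map names, and the Dutch title suffix " (doorverwijspagina)", are assumptions for illustration only and not the framework's actual source; the point is simply that a missing "nl" key in such a map is what produces the "key not found: nl" error seen in the stack trace.
```
// Hypothetical, simplified stand-in for the per-language configuration the thread
// points at (DisambiguationExtractorConfig). Names and the Dutch suffix are
// assumptions, not the framework's actual code.
object DisambiguationSuffixConfig {

  // language code -> title suffix used by that Wikipedia's disambiguation pages;
  // a missing key is exactly what raises "NoSuchElementException: key not found: nl"
  val disambiguationTitlePart: Map[String, String] = Map(
    "en" -> " (disambiguation)",
    "de" -> " (Begriffsklärung)",
    "nl" -> " (doorverwijspagina)"  // the line a Dutch setup would add
  )

  def suffixFor(langCode: String): String = disambiguationTitlePart(langCode)

  def main(args: Array[String]): Unit =
    println(suffixFor("nl")) // prints " (doorverwijspagina)" once the entry exists
}
```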
What forum/list is used to discuss it? On Tue, Jun 4, 2013 at 6:40 AM, Jona Christopher Sahnwaldt < >wrote: There's a precision/recall graph here: Tom uHi everyone, FYI, the integration of the Airpedia types datasets [1, 2, 3] into DBpedia is one of the first tasks for the GSoC project 'type inference to extend coverage' [4] (see also a related idea [5]). @Tom, there is no official Airpedia mailing list. I think the DBpedia one fits well, you can freely ask here, Alessio (the main guy behind) and me will try to answer to your questions. BTW (just my 2 cents), it would be interesting to investigate if the same approach can be applied to Freebase. Cheers! [1] [2] [3] [4] [5] On 6/4/13 4:15 PM, Tom Morris wrote:" "Warnings from DBpedia 3.0 N-Triples file." "uHi there, I did a checking pass over the DBPedia 3.0 (obtained from The complete log of checking is at: Line numbers in the file referred to DBPedia 3.0, all the files concatenated in alphabetic filename order. To make sure we're all on the same page, I've put that file at: The log file is 114343 warnings but the warning fall into a few specific classes: == Broken URIs Some examples: Very broken: not sure what happened here: Missing port: Use of [] Scrambled: This form is quite common: Multiple fragments: == Literals Use of commas in xsd:integer and xsd:decimals: e.g. \"3,209\"^^ \"5,856.2\"^^ Use of dot in integers: \"1.00\"^^ Strange dates: \"0017-11-70\"^^ == URI warnings : not in preferred forms: More minor: \"lowercase is preferred in host names\" e.g. and \"Uppercase is preferred for % encoding\"" "DBpedia as Tables release 2014" "uDear all, We are happy to announce the 2014 version of the DBpedia as Tables tool [1]. As some of the potential users of DBpedia might not be familiar with the RDF data model and the SPARQL query language, we provide some of the core DBpedia (Release 2014) data in tabular form as Comma-Separated-Values (CSV) files and as JSON files, which can easily be processed using standard tools, such as spreadsheet applications, relational databases or data mining tools. For each class in the DBpedia ontology (such as Person, Radio Station, Ice Hockey Player, or Band) we provide a single CSV/JSON file which contains all instances of this class. Each instance is described by its URI, an English label and a short abstract, the mapping-based infobox data describing the instance (extracted from the English edition of Wikipedia), geo-coordinates, and external links. Altogether we provide 685 CSV/JSON files packed in a single file (3.8 GB compressed). In addition, we also provide separate CSV/JSON files for each class for download. More information about the file format as well as the download link can be found on the DBpedia as Tables Wiki page [1]. Any feedback is welcome! Best regards, Petar and Chris [1] DBpediaAsTables" "public sparql endpoints not working" "uHi, who can take a loot at I got error messages like Virtuoso 08C01 Error CL: Cluster could not connect to host 2 22202 error 111 SPARQL query: define sql:big-data-const 0 #output-format:application/sparql-results+json define input:default-graph-uri PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: PREFIX dbo: SELECT DISTINCT ?x WHERE { ?x a dbo:Bridge . ?x dbo:architectualBureau :Suspension_bridge . } Another problem I often experienced when using text search bif:contains is \"Transaction time out \" even though the queries I issued are not complicated at all. 
Can we increase the allowed transaction time a little bit? Thanks, Lushan Han uHI Lushan, Fixed, please try again. I will discuss this internally, however you should really use 'anytime' queries on long running queries as discussed in: Patrick" "dbpedia data exports contain data not found in the endpoint" "uHi all, I just joined this forum, and this is my first post.  I posted a question over at: no luck with a response yet.  I am hoping for help here now.  Essentially I am finding a lot of data in the dbpedia data exports that is not in the live endpoints.  See the above post for one example. Another example query comes from the relFinder tool, using one of the examples.  I modified to just return a count, rather than first 10 results. SELECT COUNT(*)  WHERE { ?pf1 ?middle . ?ps1 ?os1 .  ?os1 ?ps2 ?middle .  FILTER ((?pf1 != ) && (?pf1 != ) && (?pf1 != ) && (?pf1 != ) && (?pf1 != ) && (?pf1 != ) && (?pf1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps1 != ) && (?ps2 != ) && (?ps2 != ) && (?ps2 != ) && (?ps2 != ) && (?ps2 != ) && (?ps2 != ) && (?ps2 != ) && (!isLiteral(?middle)) && (?middle != ) && (?middle != ) && (?middle != ?os1 ) && (!isLiteral(?os1)) && (?os1 != ) && (?os1 != ) && (?os1 != ?middle ) ). } in dbpedia this produces 6 results.  In the data exports, loaded into a triple store, it returns 396 results. Can someone guide me as to what portion of the dbpedia data is loaded into the SPARQL endpoint? Thanks, Tim Hi all, I just joined this forum, and this is my first post. I posted a question over at: . ?os1 ?ps2 ?middle . FILTER ((?pf1 != < < < Tim" "Update the CreateFreebaseLinks based on the new Freebase RDF dump format (#25)" "uHi Jona, thanks for merging the pull request! Anyway, couldn't we use percent encoding for Unicode code points which are not allowed in N-Triples? (namely those outside the [#x20,#7E] range? In this case we should get UTF-8 bytes and percent encode them. For example, as far as I can see Marl$00C3$00ADn$002C_$00C3$0081vila is where \00C3 is 0xC3 0x83 \00AD is 0xC2 0xAD \0081 is 0xC2 0x81 WDYT? Cheers Andrea 2013/3/22 Christopher Sahnwaldt < > Hi Jona, thanks for merging the pull request! Anyway, couldn't we use percent encoding for Unicode code points which are not allowed in N-Triples? (namely those outside the [#x20,#7E] range? In this case we should get UTF-8 bytes and percent encode them. For example, as far as I can see Marl$00C3$00ADn$002C_$00C3$0081vila is < . uOn 22 March 2013 23:21, Andrea Di Menna < > wrote: I prefer the \"garbage in, garbage out\" style. The freebase keys are broken. We could try to fix them, but we would have to use several different heuristics: with percent encoding, we could \"fix\" the keys that are UTF-8 encoded, but not the ones that are Windows-encoded. To fix the keys containing whitespace, we would first have to UTF-8 the Unicode code point, then percent encode the UTF-8it's a mess. And anyway, we try to move towards IRIs, not URIs, and IRIs wouldn't contain percent-encodings for these characters. How many keys are affected anyway? I think we generate several million freebase links, so even if 100,000 freebase keys are broken, it's not a big problem. JC uPS: or in this case, \"garbage in, nothing out\" uOn 22 March 2013 23:21, Andrea Di Menna < > wrote: Oh, by the way, it would be UTF-8-percent-encoding for Marlín,_Ávila. 
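To make the escaping under discussion concrete, here is a small, hedged illustration (not project code) of how such a Freebase MQL key decodes: each $XXXX escape stands for one Unicode code point, so the example key yields U+00C3, U+00AD and so on, which are the UTF-8 bytes of "í" and "Á" read as characters; that is the double encoding being described.
```
// Toy decoder for the MQL key escaping discussed above ("$XXXX" = one code point).
// Not the project's code; it just shows why the example key looks double-encoded.
object MqlKeyDemo {
  private val Escape = """\$([0-9A-Fa-f]{4})""".r

  def unescape(key: String): String =
    Escape.replaceAllIn(key, m => Integer.parseInt(m.group(1), 16).toChar.toString)

  def main(args: Array[String]): Unit = {
    val decoded = unescape("Marl$00C3$00ADn$002C_$00C3$0081vila")
    // yields "Marl" + U+00C3 U+00AD + "n,_" + U+00C3 U+0081 + "vila",
    // i.e. the UTF-8 bytes of "í" and "Á" as characters, not "Marlín,_Ávila"
    println(decoded)
  }
}
```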
The weird thing is that these Wikipedia page titles in the Freebase contain UTF-8-encoded characters when they should contain no encoding at all, just plain Unicode code points. (Of course, the characters and codepoints are also dollar-escaped as usual for Freebase, but that's not a problem.) JC uCan someone point to the part of the discussion which talks about what the problem is? This thread seems to start in mid-stream Freebase's MQL key encoding ( is a completely private encoding which shouldn't have any effect on external URIs/IRIs/references/etc On Sun, Mar 24, 2013 at 9:44 PM, Jona Christopher Sahnwaldt < uOn Mar 25, 2013 3:32 AM, \"Tom Morris\" < > wrote: the problem is? This thread seems to start in mid-stream That's right. Sorry. The start of the thread is in the middle of this page: encoding which shouldn't have any effect on external URIs/IRIs/references/etc That's correct, and that's how the Scala script has always worked: it unescapes the MQL keys and uses the result to form DBpedia IRIs. The problems arise because some MQL keys contain invalid escapes (UTF-8 and Windows-1252 bytes instead of Unicode code points), and some others contain whitespace like U+2003 that is invalid even in IRIs. I would guess though that it's not a big problem because the affected keys are 1. not many, i.e. <1% and 2. not relevant anyway because they do not represent valid, current, non-redirect Wikipedia page titles. That's just a guess though, based on only a very cursory look at a few bad keys. I don't remember if these problems also came up when I ran the script on the old freebase dump format. JC > wrote: are actually Freebase our uHi, Maybe the only thing that can be done is to notify the freebase discussion list about this problem. Agree with Jona that the number of problematic references is not relevant. Cheers Andrea 2013/3/25 Jona Christopher Sahnwaldt < > uHi all, it looks like there are actually some pages in Wikipedia which contain wrong data, which is where the pages originate from in Freebase, e.g. This page has been deleted on Jan 21, and this actually lead to the Freebase key Marl$00C3$00ADn$002C_$00C3$0081vila since UTF-8 0xC3 0x83 -> Unicode U+00C3 , etc Cheers Andrea 2013/3/25 Andrea Di Menna < > uI wouldn't claim that Freebase is bug-free, but that's a quite old and simple algorithm, so unless they're triples from very early in it's life (say, 2007), I'd guess that bad input data from Wikipedia is more likely than a problem with the transformation. It might help to give a little background on how Freebase deals with these links. The canonical link uses the article number (in the namespace /wikipedia/en_id), but the alpha title (MQL key escaped) *and all redirects* are also stored (namespace /wikipedia/en). Additionally, the same information has recently been added for number of the other language wikipedias. You can see them all here for the example that Andrea mentioned: Outbound links from Freebase to Wikipedia are made using the article number, so that's really the most important link. The wisdom of including redirects is debatable, I think. Sometimes they're good alternate names, but other times they represent misspellings, related concepts, etc. If DBpedia has the Wikipedia article number, I'd suggest creating the links based on those. If not, I'd suggest using the redirect file to canoncialize on a single \"best\" link. 
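A rough sketch of the id-based linking suggested above, with the inputs simplified to tab-separated files (Wikipedia page id to DBpedia title, and Freebase mid to page id). The real inputs are N-Triples and Freebase RDF dumps and would need proper parsing; the file arguments here are placeholders, and this is not the project's actual script.
```
// Sketch only: build DBpedia -> Freebase sameAs links via Wikipedia page ids.
// Inputs are simplified to TSV ("pageId<TAB>dbpediaTitle" and "mid<TAB>pageId").
import scala.io.Source

object FreebaseLinksFromPageIds {
  def main(args: Array[String]): Unit = {
    val Array(pageIdFile, freebaseFile) = args // two TSV files, see comment above

    // wikiPageId -> DBpedia title
    val idToTitle: Map[String, String] =
      Source.fromFile(pageIdFile, "UTF-8").getLines()
        .map(_.split('\t'))
        .collect { case Array(id, title) => id -> title }
        .toMap

    // emit one owl:sameAs line per Freebase topic whose page id we know
    for (line <- Source.fromFile(freebaseFile, "UTF-8").getLines()) {
      line.split('\t') match {
        case Array(mid, pageId) =>
          idToTitle.get(pageId).foreach { title =>
            println(s"<http://dbpedia.org/resource/$title> " +
                    s"<http://www.w3.org/2002/07/owl#sameAs> " +
                    s"<http://rdf.freebase.com/ns/$mid> .")
          }
        case _ => // skip malformed lines
      }
    }
  }
}
```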
Tom On Mon, Mar 25, 2013 at 6:41 AM, Andrea Di Menna < > wrote: uAnother approach might be to use the recently introduced Topic Equivalent Webpage property: ns:m.09q3rp ns:common.topic.topic_equivalent_webpage < ns:m.09q3rp ns:common.topic.topic_equivalent_webpage < ns:m.09q3rp ns:common.topic.topic_equivalent_webpage < ns:m.09q3rp ns:common.topic.topic_equivalent_webpage < It appears to be a single canonical alpha link for each language Wikipedia with the MQL escaping undone and the redirects resolved. Tom On Mon, Mar 25, 2013 at 9:18 AM, Tom Morris < > wrote: uHi, we have article numeric ids in the quads file (as oldid parameter). Jona, do you think this is worth giving a try? Regards Andrea 2013/3/25 Tom Morris < > uSorry, wrong information. We should use Page Ids ( I am going to try something. Cheers Andrea 2013/3/25 Andrea Di Menna < > uOn 25 March 2013 14:18, Tom Morris < > wrote: Our script works like this: - load all wikipedia page titles in the main namespace into a set (i.e. no categories, templates, etc.) - subtract from that set all titles that some other DBpedia code (which probably has 98-99% precision) recognized as redirect or disambiguation pages - create links to Freebase only for titles that are in the result set, i.e. that are very likely content pages Before we can look up a Freebase title in the result set, we generate the equivalent DBpedia IRI for it. This transformation fails for the Freebase keys we're discussing here, but I surmise this only happens for titles that neither Freebase, DBpedia nor Wikipedia really use, so it's not a real problem. Cheers, JC uOn 25 March 2013 15:00, Tom Morris < > wrote: Sounds good! I think we would only have to change a few lines in our script to use these instead. uHi Andrea, Wikipedia page ids (URL parameter curid) are more stable than page titles, and according to Tom, Freebase uses them as the main links to Wikipedia, but DBpedia still uses the current page title as the canonical resource IRI, so the DBpedia-to-Freebase linkset has to use the page title. I assume Freebase also uses the current page title in triples like ns:m.09q3rp ns:common.topic.topic_equivalent_webpage . so I think we should simply use these lines. Of course, this will fail for the few Wikipedia page titles that changed between the time Freebase generates their links and DBpedia extracts its data, but that's no big deal. We have bigger fish to fry. :-) To make our Freebase script use the article id, you'd have to load page_ids_en.nt.bz2 , build a map from ids to titles, look for ids in the Freebase dumps, map them to titlesdoable, but a lot of work Cheers, JC On 25 March 2013 15:42, Andrea Di Menna < > wrote:" "Mapping problem for class Film" "uHi there, Anyone knows why the *Titanic_(1997_film)* entry in DBpedia is missing the list of actors? (What's Titanic without Leonardo and Kate! ;) Original page on Wikipedia: => Corresponding resource page on DBpedia 3.8: => Corresponding resource on DBpedia Live (latest data and code): => The Wikipedia page has the cast information (i.e. starring) in its infobox and uses the Infobox_film template, for which a DBpedia mapping exists: Any clue? Nicolas. uHi Nicolas, As far as I can see the actors in the starring infobox property are enclosed in a Plainlist template. Maybe the extractor does not expect to find a template there? 
(Question for the dbpedia devs) Cheers Andrea Il giorno 15/gen/2013 18:36, \"Nicolas Torzec\" < > ha scritto: uYep that was my clue too: a \"plain list\" template is used for listing the actors, instead of a comma-separated list or a newline-separated list. Anyone? -N On Jan 15, 2013, at 10:50 AM, Andrea Di Menna < > wrote: uHi all, this is true, the framework currently does not handle list values embedded in templates. DBpedia is open source and you are of course welcome to help us fix this issue. You can help directly with code of course ;) or with a more proper documentation of the problem, perhaps gather other similar templates and use cases in order to have a better idea on how to fix this globally Best, Dimitris On Tue, Jan 15, 2013 at 9:53 PM, Nicolas Torzec < >wrote:" "DBPedia Live Updates?" "uHi, I'm really looking forward to seeing dbpedia live be released, it'll be fantastic to be able to fix up wikipedia infoboxes and see the data percolate through in (near) real-time! I'm curious about how updates to the dbpedia data will be pushed out. There are multiple use cases for applications wanting to tap into the feed of updates, so I'm wondering if there a protocol or mechanism that is (or could be) documented to support pushing out of notifications of changes to resources? I presume there will continue to be regular dumps of the whole dbpedia datasets, but I'm wondering whether we can now expect to get more fine-grained updates too? Cheers, L. uLeigh Dodds wrote: I've to admit that so far we concentrated on implementing the live updates generally, but have not yet thought much about how to making the updates also available to third parties. However, once DBpedia-Live is running smoothly it will be relatively easy to create nightly dumps and we will also try to publish some kind of diffs or update logs in smaller increments, or even real-time. One posibility is to publish the updates as linked data itself - e.g. as suggested in our Triplify paper [1]. Anyway, it will probably aways be a pull mechanism instead of push, since there are lots of updates always - so you can be sure to get something new when you pull in updates. Push from my point of view only makes sense, when updates are rare. Summing up: Its definitely on our agenda, but for now we have to focus our (unfortunately limited) engineering resources on stabilizing the live updates in general for now. uHi Sören, 2009/11/5 Sören Auer < >: Great. I'm not sure I agree there. There's been some interesting work recently (on the web in general, outside of semweb circles) on push based protocols that use some form of federation and distribution for scaling, E.g. PubSubHubbub, or XMPP. These might usefully be used in this context. Great, thanks for the update, good to know its on your radar! Cheers, L. uLeigh Dodds wrote: Leigh, Atom Publishing, PubSubHubbub, or XMPP, all of these are options once we get this Live instance done. This has been a pretty complex undertaking bearing in mind we are reading, deleting, writing huge volumes of data. \"Data as a Service\" must have a delta exchange mechanism, long term. We are all in agreement here, and it will happen :-) uHi, 2009/11/5 Kingsley Idehen < >: Understood. This is an area I'm very interested in, so it would be useful to discuss these mechanisms in a wider context once work is ready to proceed. Cheers, L. uLeigh Dodds wrote: Absolutely! There is no preconceived design in place right now. 
There are many ways to skin this rat, so open dialog can start in earnest once DBpedia Live is out (should be very soon). We should use this forum to discuss and develop the solution etc" "Dbpedia Lookup Exact String Matches Don't Appear First In Results" "uTrying to resolve place names to URIs via the dbpedia lookup web-service. What I'm seeing is that less-specific results are showing up earlier in the order of results than exact ones, e.g. sands$ ./dbpedia_lookup_place.rb \"Massachusetts\" Boston http://dbpedia.org/resource/Boston Massachusetts http://dbpedia.org/resource/Massachusetts Cambridge, Massachusetts http://dbpedia.org/resource/Cambridge,_Massachusetts Worcester, Massachusetts http://dbpedia.org/resource/Worcester,_Massachusetts Springfield, Massachusetts http://dbpedia.org/resource/Springfield,_Massachusetts I of course want the most general of these, which is typically the one with the exact name, though Boston comes up before Massachusetts. It appears that instead the results are ordered based on refCount, which seems an odd choice for a lookup service. I can of course write code to iterate through the labels and pick the one that's exact, but it seems like unnecessary work for the consumer of the data. What is the best recourse here? Another example: sands$ ./dbpedia_lookup_place.rb \"Alberta\" Calgary http://dbpedia.org/resource/Calgary Edmonton http://dbpedia.org/resource/Edmonton Lethbridge http://dbpedia.org/resource/Lethbridge Red Deer, Alberta http://dbpedia.org/resource/Red_Deer,_Alberta Moose Jaw http://dbpedia.org/resource/Moose_Jaw This is even more unfortunate because this URI (the exact resource) doesn't even appear: http://en.wikipedia.org/wiki/Alberta Cheers, - Sands Fish - MIT Libraries - Data Scientist / Software Engineer - / @sandsfish
uAnother example, where ontology coding may be at fault; I haven't dug that deep, and dbpedia.org/resource pages seem to be down for maintenance at the moment. http://en.wikipedia.org/wiki/Bosphorus doesn't show up at all for: sands$ ./dbpedia_lookup_place.rb \"Bosporus\" Bosphorus Bridge http://dbpedia.org/resource/Bosphorus_Bridge Marmaray http://dbpedia.org/resource/Marmaray"
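A minimal sketch of the client-side workaround mentioned in the message above: re-rank the lookup results so exact label matches come before the refCount ordering. The result list is hard-coded from the Massachusetts example; a real client would parse the lookup service response instead.
```
// Client-side re-ranking sketch: exact (case-insensitive) label matches first,
// otherwise keep the service's original order.
object ExactMatchFirst {
  final case class Hit(label: String, uri: String)

  def rerank(query: String, hits: Seq[Hit]): Seq[Hit] = {
    val (exact, rest) = hits.partition(_.label.equalsIgnoreCase(query.trim))
    exact ++ rest // relative order within each group is preserved
  }

  def main(args: Array[String]): Unit = {
    // hard-coded from the example above; a real client would parse the API response
    val hits = Seq(
      Hit("Boston", "http://dbpedia.org/resource/Boston"),
      Hit("Massachusetts", "http://dbpedia.org/resource/Massachusetts"),
      Hit("Cambridge, Massachusetts", "http://dbpedia.org/resource/Cambridge,_Massachusetts")
    )
    rerank("Massachusetts", hits).foreach(h => println(s"${h.label}\t${h.uri}"))
  }
}
```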
"Announcement: DBpedia Faceted Browser and DBpedia User Script released" "uHi all, We are pleased to announce the release of the DBpedia Faceted Browser [1] by Jona Christopher Sahnwaldt as well as the DBpedia User Script [2] by Anja Jentzsch.
The DBpedia Faceted Browser allows you to explore Wikipedia via a faceted browsing interface. It supports keyword queries and offers relevant facets to narrow down search results, based on the DBpedia Ontology. In this manner, queries such as \"recent films about Buenos Aires\" can be easily and intuitively posed against DBpedia. The DBpedia Faceted browser was developed in cooperation with the search engine company Neofonie, which also kindly provided the funding for this project. The DBpedia User Script is a Greasemonkey script that enhances Wikipedia pages with a link to their corresponding DBpedia page and can be used within Firefox, Safari and Opera with a suitable Greasemonkey plugin. Cheers, Christian Becker [1] [2] DBpediaUserScript uChristian Becker wrote: I am trying to compare this new Faceted Browsing approach to say: miniaturized DBpedia logo based icon on Wikipedia pages, but when I click, I end up with the standard green DBpedia HTML+RDFa page. What step am I missing? Kingsley uHi Kingsley, I think what makes the new Faceted Browser stand out is its use of the DBpedia ontology for facet selection, which makes it really intuitive to use. Plus, it's built for DBpedia and thus makes use of redirects, link counts etc. for better result sorting. I think that's about all it does :) Cheers, Christian uChristian Becker wrote: If I go to see anything different from what I would see if I started from Wikipedia, by clicking the icon? So I assume its a shortcut from Wikipedia HTML page to DBpedia's HTML+RDFa page, in a nutshell? If the above is true, here lies my confusion: why hasn't the DBpedia HTML page been described as a Faceted Browsing Interface until now? Kingsley uHi Kingsley, maybe I should have written the announcement more clearly - these are two completely separate applications that we simply announced together, i.e. the DBpedia User Script, which adds the icon, has nothing to do with the Faceted Browser. The Faceted Browser is here: Cheers, Christian On Sep 22, 2009, at 7:14 PM, Kingsley Idehen wrote: uChristian Becker wrote: You really didn't make it clear in the initial announcement. Much better, nice job! Certainly a plus on the intuitive side relative to: put this face on top of the /fct endpoint (REST or SOAP based service) and then leverage the sophistication it offers. Examples: 1. optional sameAs and IFP based data expansion/explosion/smushing/meshing 2. anytime query feature - ability to retry within configurable response time while horizontally partitioned aggregation works in background 3. ability let the user choose the ontology that drives the faceted navigation rather than being confined to the DBpedia ontology 4. facet labels should be linkeddata URIs too i.e. enable faceted view of the TBox realm Kingsley uOn Tue, Sep 22, 2009 at 9:19 AM, Christian Becker < > wrote: DBPedia is always so slow. I clicked a couple of links in this new browser and the page I ended up at was just hanging. As we add more data to the semantic web the number of links between resources is going to scale exponentially. I wonder if that scaling is going to be faster than the availability of cheaper, faster computers. It is a deluge of data. Can DBPedia/semantic web be more than an academic idea? 
$ time wget Elwell_Stephen_Otis uBrian wrote: uChristian Becker wrote: +1 That Faceted Browser is just awesome, and Christian's comment points out why I like the DBpedia ontology: it's a simple ontology that reflects the way people think about \"things\"; by avoiding the areas that are hard and sticking to concrete things, it delivers a lot of value. I can certainly think of specific things I'd like it to do differently, but it's a very different situations from UMBEL or YAGO, whose sheer mass makes them difficult to understand and use (see Exhibit me on to YAGO, but I want to punch the wall when I look at the .nt dumps and see categories like (slight exaggeration) \"HungarianAmericanComedicLeftHandedComedicActorsWhoWereBornOnTuesdayNeverEatEggsForBreakfastWhoHaveWonDragRaces\" uPaul Houle wrote: Paul, Kingsley shouldn't push you into anything :-) You should use what works best for your use case etc The option to use SUMO, OpenCyc, Yago, UMBEL etcshould remain as options. Christian: Is there any effort to map DBpedia to any of the above, once done the problem is solved as Virtuoso's Reasoner will handle the rest of the work. Is this effort open source? uKingsley Idehen wrote: Of course, they are there, but I haven't found any of those four all that exciting. Something I do find more useful in the immediate term is to bring in specialized heavyweight taxonomies & databases. For instance, ITIS, Mesh, PubChem, ATCCS, etc. For instance, if you look at an entry like this in wikipedia, you see that drugs (for instance) are really well connected to high-quality identifiers and categories. It's not a difficult project to bring those out and mesh them a little better with the DBpedia ontology. Conversely, it doesn't make a sense to put a lot of effort into classifying drugs or living species in much more detail because the job is already done. (Unless you're ~really~ into the details: I've done some work with ITIS because it's logically consistent, but there isn't 100% agreement between taxonomists about everything. I looked at about 20 cases where Wikipedia and ITIS disagreed about some detail came to the conclusion that Wikipedia had a more modern viewpoint about 80% of the time) Another interesting directions is to clearly identify \"genetic categories\" (Mallard Duck, Honda Civic) as related things and keep a clear distinction between \"members of a generic category\" and the \"generic category\" itself. For instance, it's true in some sense that any Person is a Eukaryote, but that's not a true relationship between a Person and Eukaryote in the Dbpedia ontology However, ought to be a Eukaryote. I'd really like to see the ultralightweight approach of the DBpedia ontology extended to things that are less concrete, things like \"concepts\", \"inventions\", \"products\", etc. There are quite a few really great types in Freebase such as \"toplevel domain\" that are concrete and easy to model that aren't currently available in DBPedia." "Odp: DBpedia ontology - predicate constraints" "uHi Jona, thanks, yes I am aware of the mapping wiki, in fact I have editor rights. I was wondering why the values are missing, so the simple answer is that so far nobody provided them. WRT to changing the domain - I think that this is not the best idea. Hometown seems to be a predicate designed specifically for people. From my POV there is an error in the mappings of \"Music band infobox\" (or sth similar) that wrongly maps English \"origin\" of the band to \"dbpedia-ontology:hometown\". 
I think that first we should define the constraints, then check for inconsistencies and finally fix the mappings. Correct me if I'm wrong. Cheers, Aleksander uHi Pablo, well, for me this seems a very interesting opinion, but I am pretty confused. First of all I guess these opinions are not equal in their support. Why I think so? Because when we drop argument constraints (I mean, we do not apply them if they are defined) we no longer follow the mathematical semantics theory, which I believe is the ground for SemanticWeb, and we adopt natural language semantics theory, with metaphor and other phenomena which are pretty hard to model using computers. Let me give you an example: there is a triple in DBpedia regarding Berlin - it is the \"country\" of e.g. City_Slang (a record label company). How can we interpret this \"fact\"? \"country\" has a \"Country\" constraint on its range and Berlin definitely is not a country. For me it's really hard to accept an interpretation in which \"country\" predicate is used for other places than countries (as values staying in the object position). Obviously there will be predicates that have a less stricter semantics than \"country\" and I can imagine using them in contexts not designed by their authors. Still in such cases providing extension to the original predicate definition would seem much better idea than just dropping any constraints. And the last thing - what is the primary point of providing types for entities in KBs such as DBpeida? I thought that they, together with the predicates' constraints, may greatly improve the quality of the KB. I know that you can make inferences based on the data, without any ontology or schema or whatever. You can improve the quality statistically, etc. Machine learning is doing pretty well without explicit constraints. But on the other hand - the DBpedia ontology, its types, predicates and the mappings from infoboxes to predicates are constructed manually. We can backtrack any invalidation of the constraints without problem. There might be many reasons for such situation - the object lacks the appropriate type assignment, the constraint is too narrow, the mapping is invalid or the original statement (i.e. infobox field) is invalid. In the last case this is a genuine problem of the data. In the case of the invalid mapping it can be easily fixed improving the quality of the hundreds or thousands of the extracted triples. So I place myself definitely in the schema camp. It would be very interesting to read some arguments coming from the data camp :-) Cheers, Aleksander uI agree totally with Aleksander. Further, Pablo, I also think it was wrong to frame this discussion as data v schema. A true *data* perspective should also respect to what the data applies and in what context. These are the express purposes of domain and range. Mike On 3/22/2014 5:30 PM, wrote: uWell, that is a straw man argument, because you picked a triple that contains a blatant error. Nobody is arguing that Berlin should become a country, or that we should reuse the property country in the wrong context. Nowthere is the need to detect that country is wrong in that context. This could be done with pre-defined domain/range, manually, or can be done automatically with probabilistic models as you pointed out. I am sure that everyone has their own definition for what a truly data driven approach means. 
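A toy, in-memory illustration of the range check being argued for in this thread (and in the extractor issue further below): given declared ranges and rdf:type assertions, flag triples whose object is not typed with the property's range, as in the Berlin-as-country example. Class and property names are abbreviated for readability, and subclass reasoning is deliberately ignored, so this is a sketch of the idea rather than a usable validator.
```
// Toy illustration (not DBpedia code) of a range check over a few triples.
object RangeCheckDemo {
  type Resource = String

  val ranges: Map[String, String] = Map("dbo:country" -> "dbo:Country")
  val types: Map[Resource, Set[String]] = Map(
    "dbr:Germany" -> Set("dbo:Country", "dbo:Place"),
    "dbr:Berlin"  -> Set("dbo:City", "dbo:Place")
  )

  // true if the property declares a range and the object lacks that type
  // (a real check would also walk the rdfs:subClassOf hierarchy)
  def violates(s: Resource, p: String, o: Resource): Boolean =
    ranges.get(p).exists(range => !types.getOrElse(o, Set.empty[String]).contains(range))

  def main(args: Array[String]): Unit = {
    val triples = Seq(
      ("dbr:City_Slang", "dbo:country", "dbr:Germany"), // ok
      ("dbr:City_Slang", "dbo:country", "dbr:Berlin")   // flagged
    )
    triples.filter { case (s, p, o) => violates(s, p, o) }
           .foreach(t => println(s"range violation: $t"))
  }
}
```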
But in the context of seeing the schema as constraints vs seeing it as merely more data, a truly data driven approach does not *respect* an arbitrary schema that was created with some unknown purpose in mind. It learns from the actual data usage and makes its own schema on the fly, perhaps even using the constraints in the schema as additional data points. There are arguments for both approaches. But I am not interested in debating this. Like I said, the data camp does not get hurt by more info. So please go ahead and summon the schema camp to contribute those constraints. The more data, the merrier. :) On Mar 22, 2014 8:29 PM, \"Mike Bergman\" < > wrote: uHi Jona, I just want to make one thing clear - if either domain or range has a owl:Thing \"constraint\" you don't export that information to the dbpedia.owl? E.g. location [1] has a owl:Thing constraint for its domain, but it is not available in the dbpedia_3.9.owl file. It is a problem, since it is not possible to tell apart missing constraints from a \"constraint\" that accepts any kind of object. Cheers, Aleksander uMissing link [1] OntologyProperty:Location u uHi, following the discussion of predicate constraints I would like to know where should I discuss the proposed changes. E.g. I would like to change the residence [1] range constraint from Thing to Place, but I would like to discuss it first. Shall I use this discussion group or maybe it's better to use the discussion feature of the ontology wiki. Is anyone observing these discussions? What is more - does it make sense to provide a human-readable description of the predicate? Which predicate should be used in such case? Kind regards, Aleksander [1] OntologyProperty:Residence uHi Aleksander, The wiki talk page seems more appropriate for this. I can create a dbpedia-ontology discussion mailing list if we have more people interested in this (please speak up). The workflow should be like this: 1) make your changes 2) provide the rational for the change in the talk page of the property / class, 3) if needed, discuss and coordinate efforts, 4) use the mailing list for very big changes. BTW, switching from owl:Thing to a high level class & in obvious cases like 'residence' might not need any documentation at all. What do you mean with human-readable description? we have rdfs:label and rdfs:comment available, do you need anything else? {{label|en|residence}} {{comment|en|the place where }} Best, Dimitris On Tue, Mar 25, 2014 at 7:50 PM, < > wrote: u uHi guys, On 3/25/14, 7:07 PM, Dimitris Kontokostas wrote: I completely agree with this. People can go ahead when dealing with minor changes e.g. labels. On the other hand, I would adopt a Wikipedia-style protection policy [1] to avoid under-the-hood major changes e.g. addition/deletion of classes/properties. @Dimitris, do you know if this is technically doable? Cheers! [1] Wikipedia:Protection_policy uHi, <@Dimitris, do you know if this is technically doable?> And would it be possible to give some kind of estimate about the impact of what we would consider to be major changes? So would I, for instance, be very happy if we could move the whole branch of GovernmentAdministriation classes from being a subclass of Place to being a subclass of Agent. In principle, a class is just a tag about the type of collection a resource belongs to, however, maybe there would be major implications as to the domains and ranges of a lot of properties. 
Regards, Gerard Van: Marco Fossati [ ] Verzonden: woensdag 26 maart 2014 11:13 Aan: Onderwerp: Re: [Dbpedia-discussion] Odp: Re: Re: DBpedia ontology - predicate constraints Hi guys, On 3/25/14, 7:07 PM, Dimitris Kontokostas wrote: I completely agree with this. People can go ahead when dealing with minor changes e.g. labels. On the other hand, I would adopt a Wikipedia-style protection policy [1] to avoid under-the-hood major changes e.g. addition/deletion of classes/properties. @Dimitris, do you know if this is technically doable? Cheers! [1] Wikipedia:Protection_policy uAfter some thinking I came to a conclusion that the best approach is to review the constraints and provide them for as many of the predicates as it's possible. This way there won't be the problem with missing constraints, since they will default to owl:Thing. I started reviewing the constraints yesterday, but I am limiting these efforts to predicates that are most important for my current research (i.e. article categorization against Umbel). There are approx. 300 predicates that I am going to review. If anyone else wants to participate in that effort, let me know, so we can share some of the work or exchange ideas. Kind regards, Aleksander uOn Mar 25, 2014 12:31 AM, \" \" < > wrote:" "Dbpedia for dummies?" "uNot to insult anybody, but it's a constant theme on this site that beginners find it challenging to get a DBpedia instance up and running. This isn't really a flaw in DBpedia, but DBpedia comes up again and again precisely because it such an interesting data set to work with. I know there are no major challenges getting DBpedia up with Virtuoso OpenLink, other than figuring out exactly what to do to install the software (different on Unix and Windows) and run the bulk loader. Once you're familiar with the product you can piece together the steps, but starting from zero people seem to have a hard time. Could we get good step-by-step instructions on the Wiki for how to get DBpedia loaded and running on UNIX and Windows from the bare metal? uOn 14 February 2012 18:46, Paul A. Houle < > wrote: OpenLink have some step-by-step guides: and their wiki is pretty helpful: #How%20Do%20I uOn 2/14/12 1:46 PM, Paul A. Houle wrote: uOn 14.02.2012 20:42, Kingsley Idehen wrote: I have written such a step-by-step intro in Polish It describes exactly how to configure an instance of Virtuoso and load DBpedia data into it. If the resources mentioned in the previous messages are not enough, I can translate this post into English (you can also try Google translates, as the results are pretty good). Kind regards, Aleksander uOn 2/14/12 5:39 PM, Aleksander Pohl wrote: Awesome! We'll add it to our doc collection. uOn 14 February 2012 22:39, Aleksander Pohl < > wrote: It misses \"self-describing\" (samoopisujące), which isn't likely to be in typical dictionaries, and gives 'Installed Virtuosu sources' instead of 'Virtuoso installed from source', but other than that, Google does pretty well. uThanks for spotting this. Fixed. Aleksander" "Video: Using the Ontology2 Edition of DBpedia 2016-04" "uI've gotten a lot of feedback from people that there are a lot of steps involved with using the AWS cloud for the first time to run a product like the Ontology2 Edition of Dbpedia. 
I made a training video that works you through the steps to create an instance, run SPARQL queries against it, and then terminate the instance and tear down all the related resources that could lead to further cost: That video is to-the-point technical and does not try to explain why a person would want to do that, so you should also look at and note that the examples that I run in the video come from here: I invite you to try it out for yourself." "Announcing Virtuoso Open-Source Edition v5.0.12" "uHi, OpenLink Software is pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.12 This version includes: * Database engine - Added Public Key Infrastructure UI in conductor - Added failover/roundrobin support for all client layers - Added support for vectors in IN predicate - Added various small engine optimizations and enhancements - Refactored JDBC driver - Imported PCRE version 7.9 from upstream project - Fixed XA support - Fixed performance of sprintf_more when using large buffers - Fixed memory leaks - Fixed HTTP various issues with HEAD and POST - Fixed serialization on HTTP connection cache - Fixed allow client to run during online backup - Fixed small bugs and compiler warnings * SPARQL and RDF - Added SPARQL graph-level security - Added new RDFa parser - Added support for Concise Bound Description - Added optimization for bif:COUNT - Added optimization for OPTIONAL - Added support for SCORE_LIMIT in bif:contains - Added support for text/n3 mime type - Added support for nvarchar, bigint in sparql/rdfviews - Added support for exclude-result-prefixes - Added support to crawl with multiple threads - Added support for NQUAD, JSON and N-Triples - Added support for DSA certificates - Added MS Docs, Open Office, Google app cartridges - Added CNET, YELP, TESCO, ZILLOW cartridges - Added Goodrelations, Geonames, Bestbuy cartridges - Added Alchemy, Yahoo Boss, Picasa, haudio cartridges - Added cache for common ontologies - Improved support for rdf:XMLLiteral exp in RDF loaders - Fixed handling of special characters in IRI - Fixed RDF view generation - Fixed and enhanced description.vsp - Fixed and enhanced iSPARQL - Fixed charset handling for cartridges * ODS Applications - Added checks for dynamic local - Added expiration so sponger can track changes - Added support for conversations - Added 'Group By' handling for listed messages - Added 'Related' section in posts - Added new API and Ubiquity commands - Added hCard microformat - Added Ontology based editing - Added OpenSearch support - Added support for Smart Folders - Fixed openid, FOAF+SSL - Fixed upstreaming attachements - Fixed RDF gems - Fixed UI profile - Fixed mail filters - Fixed atom-pub protocol - Fixed rewrite rules - Fixed tag URIs - Fixed small bugs Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: * Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): * Home Page: Regards, ~Tim" "Trouble with XML RDF Data" "uFor a few weeks now, I've been able to make requests to URI's like so: However, this seems to be giving no/lacking data now: xml version='1.0' encoding='%SOUP-ENCODING%' Has something changed recently? Thank you. David Shugars For a few weeks now, I've been able to make requests to URI's like so: Shugars u0€ *†H†÷  €0€1 0 + uYes, it appears to work now. Thank you! 
David On Mon, Nov 26, 2012 at 8:46 AM, Kingsley Idehen < >wrote:" "Conditional mapping not working?" "uHi again, I just tried to use a conditional mapping and somehow failed. In the infobox \"Ortsteil einer Gemeinde in Deutschland\" ( you have a property whose semantic changes depending on whether another property is set or not set: - \"Höhe\" = average elevation, if \"Höhe-bis\" is not set, - \"Höhe\" = minimum elevation, otherwise. So I applied a conditional mapping to distinguish the two cases. Unfortunately if the otherwise branch holds true for a page then no template extraction for this page is performed, at all! If you test the mapping of you will recognize this behaviour at the first two entries displayed (Hellerau and Pillnitz). So what is wrong with the mapping? A second issue: Are there any possibilities of performing a more general conditional mapping? Sometimes it would be useful to have some conditions which are independent of each other. Say if in the former infobox there would have been the option to set the unit of area by appending \"acre\" or \"km2\" to the actual value. How would you realize the additional condition? And a third one: What if there was the need of linking several conditions? If the user could also choose the unit of elevation there would be the need to \"and\" the \"Höhe-bis\"-set condition and the \"Höhe\"-contains-\"acre\" condition. Is that possible? Thanks for your help in advance, Bastian uHi Bastian uHi, i think i would be really helpful for the mapping creators - me included :) - if they could test custom wikipedia pages and not just the first 5-6 in the list. There is already a greasemonkey script for DBpedia from Anja Jentzsch ( it could be easily modified to link to a custom mapping server page, provided the server supports custom page extraction. If the server supports it, you can send me the url and I will create the script in order to be included it in the mapping Wiki. Cheers, Dimitris On Wed, Apr 6, 2011 at 2:48 PM, Paul Kreis < > wrote: uHi Paul, You were right, the list seems to be outdated. I just temporarily modified an article which used the infobox \"Infobox Ortsteil einer Gemeinde in Deutschland\" and everything was extracted flawlessly. Let's just say we have an infobox like: {{ Infobox settlement | area = | elevation = }} If valid units of the values were km2/mi2 (area) and m/ft (elevation) - the user would then have to specify the unit when using the template, e.g. \"area = 23 km2\" - you'd have two independent categories. In the mapping you'd have to make a condition based on which unit string is contained in the area property. The same holds true for the elevation property. But as far as I understood the syntax of mappings you cannot nest the conditions, can you? Am I right with the assumption that conditions have to be at the first level and that the conditional mapping statements have to be whole templatemappings? It would be nifty to be able to define the mapping like this: {{ TemplateMapping | mapToClass = Settlement | mappings = uHello Bastian, first, units are parsed by the UnitValueParser. \"If a template property containing a numerical value and a unit is mapped, the unit has to be defined (Please use only values from DBpedia unit and dimensions). If a template property has no default unit defined, e.g. its values can contain different units of the same dimension, the dimension has to be defined for usability as well as validation reasons. 
Possible dimensions are Length or Mass.\" ( In your example no default unit is given, therefore the dimension is sufficient: {{ PropertyMapping | templateProperty = area | ontologyProperty = area | unit = Area}} The UnitValueParser recognises the used unit and converts it to the standard unit of the Area dimension. Second, if you take a look at the {{ConditionalMapping | cases = {{Condition | templateProperty = divisio | operator = contains | value = Magnoliophyta | mapping = {{TemplateMapping | mapToClass = FloweringPlant }} }} {{Condition | templateProperty = classis | operator = contains | value = Chondrichthyes | mapping = {{TemplateMapping | mapToClass = Fish }} }} {{Condition | templateProperty = regnum | operator = contains | value = Plant | mapping = {{TemplateMapping | mapToClass = Plant }} }} best regards, paul" "Error Unknown language" "uHi, I've got a problem querying public DBpedia sparql endpoint ( interface to run the query. Problem occurs when I want to execute the same query through my application. I'm using the HTTP POST method to pass the input encoded query but the server response is \"*500 SPARQL Request Failed\"*with following message included: *RDFXX Error Unknown language in DB.DBA.RDF_LANGUAGE_OF_LONG, bad id 0* My query looks like this: PREFIX rdfs: PREFIX foaf: PREFIX db-prop: PREFIX db-ont: PREFIX geo: SELECT DISTINCT ?subject ?label ?comment ?homepage ?page ?thumb ?photo ?latlong (sql:rnk_scale ( (?subject))) AS ?rank WHERE { ?subject rdfs:label ?label . OPTIONAL { ?subject rdfs:comment ?comment } . OPTIONAL { ?subject foaf:homepage ?homepage } . OPTIONAL { ?subject foaf:page ?page } . OPTIONAL { ?subject db-ont:thumbnail ?thumb } . OPTIONAL { ?subject db-prop:hasPhotoCollection ?photo } . OPTIONAL { ?subject geo:geometry ?latlong } . FILTER(bif:contains(?label, '\"berlin\"') && langMatches(lang(?label), \"en\") && langMatches(lang(?comment), \"en\")) . } ORDER BY DESC(?rank) LIMIT 3 Thank you for your help Zbynek Botlo Hi, I've got a problem querying public DBpedia sparql endpoint ( Botlo uHave you tried single quotes (ie: 'en' )? Or this syntax?: FILTER ( lang(?label) = 'en' ) Zbyněk Botló, 01-12-2010 19:50: uThank you for your reactions Mirela and Alex. Unfortunately my problem is still there even though I tried what you were suggesting. I really don't understand why sometimes everything works fine and other time I get these errors with the SAME sparql query. I expect for one query always the same response but it is not. I don't know what are my options here. 2010/12/2 Alex Rodriguez Lopez < >" "DBpedia in ReadWriteWeb’s Top 10 Semantic Web Products of 2009" "uDear DBpedians, The new year is slowly approaching and people start compiling their top 10 lists of 2009. The popular Web technology blog ReadWriteWeb picked DBpedia as one of the top 10 Semantic Web products of 2009. Its actually the only non-commercial community project in the list and in good company with products such as Google’s Search Options and Rich Snippets, Apperture and Data.gov. Other picks, which btw. heavily use or link to DBpedia, include OpenCalais, Freebase, BBC Music and Zemanta. Read the full article at Sören" "DBpedia live: issues with classes and properties" "uHi, I've just figured out some issues in the DBpedia Live export, compared to the \"static\" version. First, it seems that RDF typing is not fully done in the live version. 
Eg vs The static version expose values for rdf:type such as: • dbpedia-owl:AdministrativeRegion • dbpedia-owl:PopulatedPlace • dbpedia-owl:Place That are not in the live version, which actually does not expose any types from the DBpedia ontology (yago + cyc only) Then, property export is also done differently. Considering the same resource, the static version exposes its mayor using dbpedia-owl:mayor and dbprop:mayor But the live version exposes only dbprop:major However, considering the ontology / mappings wiki, it seems that dbpedia-owl is the right way to go, right ? Is that something that the team is working on ? Or, should we consider the current DBpedia live export as the way DBpedia will go re. typing and the ontology used ? Thanks, Alex. uHi, DBpedia-Live is still working on the old PHP framework, so it uses old mappings, which are fixed and can not be changed. It basically has these three major issues: - Mappings can not be edited - Mappings are only used partially - Abstracts are not correctly updated (only abstract_live) We will not fix this in the current setup (PHP framework), but we are already quite far rewriting the Live framework in Java. This will have full and up-to-date abstracts, full support of everything in the Mappings Wiki and will structurally be similar to the latest DBpedia dump, albeit only for English language. Mohamed Morsey [1] is currently working on the rewriting and he is almost finished with the development and is moving on to testing. The problem is that the Live Extraction is difficult to test. Lots of data, lots of updates and is hard to reproduce errors, because they occur in one article at a certain revision only. After the basics are finished, we will set up a store with the new live framework and then we need feedback from the community and users in order to find some more bugs. After this we will commit some time in update logs (in conjunction with OpenLink ) and also will implement some update strategies for the case that mappings change in the Wiki (imagine you change a property mapping and 10.000 triples have to be changed, with the most difficult part is finding the triples). By the way, we still hope someday to be able to move our mappings wiki within the unified login area of the Wikipedias and in general do the editing there (i.e. add more Semantics). Regards, Sebastian [1] Am 18.08.2010 20:31, schrieb Alexandre Passant: uHi, On 19 Aug 2010, at 09:09, Sebastian Hellmann wrote: Ok - btw, will the \"old\" dbprop:xx URI remain in the live export ? (as in the current dump that often have both dbprop: and dbpedia-owl: statements Thanks for the informations, do you have any ETA for these steps ? BTW, is there any community validation of the mappings ? I see that editors must be registered, but do you plan an additional step - or just keep it the wiki way where every registered user can update mappings ? What do you mean ? Edit the mappings in Wikipedia directly ? Thanks, Alex. uHi, Am 19.08.2010 10:43, schrieb Alexandre Passant: Yes, they need to stay in there for three reasons: Legacy support and stability, better coverage, debugging . Hm, The best I can say is that they will probably happen sequentially in that order ;) maybe about 1 month to see first results. What do you propose? We could include something like the German Wikipedia has. They have \"Approved versions\", i.e. you can edit anything, but it first comes into effect, if a member of a \"core\" community approves it. This would not be a bad idea. 
But we are open to other suggestions, in case you have any. We started an attempt to keep everything that is in the mappings Wiki now in Wikipedia. But it failed for several reasons It is a hard read: The general idea is that people from Wikipedia and DBpedia can encode formal facts and information directly in the source, i.e. Wikipedia. As an example on article level, there could be a template in an article that includes owl:sameAs link. The Wikipedia article itself would then be the place to maintain that link. Regards, Sebastian uOn 19 Aug 2010, at 10:03, Sebastian Hellmann wrote: Ok - quite understandable for 1st and 3rd. Regarding the better coverage, do you have any pointer where I can find what's covered by these properties, but not by the new ontology ? Ok, thanks. Is the current codebase of the live extractor available from SVN / GIT ? I was thinking of something similar, or maybe with a more community oriented workflow. Once a change is proposed, there's a short timeframe (e.g. 3 to 5 days) for voting. If only +1 or no votes after the deadline, change is approved. If at least one -1, that's subject to discussion and the mapping is not validated unless -1 is removed. Right, thanks for the links ! Best, Alex. uAm 19.08.2010 11:14, schrieb Alexandre Passant: That is a good question and afaik nobody has anaylsed this extensively. You can look e.g. on page 10 chapter 3.3 . The numbers changed slightly. It is a maven module: and depends on core: I moved that thought here (A lot of ideas get lost on the mailing list) (The Wiki is also free to register and edit) Regards, Sebastian" "object property extractor should check rdfs:range" "uConsider dbo:firstAccentPerson, which declares rdfs:range dbo:Person. The object property extractor does not check range when extracting, so it extracts any link that people have used, and a lot of them are not people: - bg.dbp:Ëõîòöå: 18_ìàé and 1956: these are \"event list\" pages that someone linked instead of providing a plain date - dbp:Abi_Gamin: United_Kingdom and Switzerland (it was a mixed British-Swiss expedition) - dbp:Gunung_Tok_Wan: Kajang (a location), because someone wrote \"A small group from Kajang Prison Officer\". - dbp:Stawamus_Squaw: dbp:Prehistory, because someone mentioned when it was first climbed (a HistoricPeriod) There are many other examples, eg check with this query: ``` PREFIX dbo: PREFIX dbp: select * {?x dbo:parent ?y filter not exists{?y a dbo:Person}} ``` and you'll find many strange values for dbo:parent, such as Archbishop, Corfu, All My Children, Adoption, etc. You can find all exceptions with a query like this, but it's too expensive for the public endpoint: ``` select ?x ?p ?y { ?x ?p ?y. filter exists {?p rdfs:range ?c. ?c rdfs:subClassOf owl:Thing filter not exists{?y a ?c}}} ``` We intend to find and publish a list of exceptions and analyze some of the root causes. I think this is the single most important improvement that can greatly increase the quality of DBpedia data. The way it is now, RDFS domain & range reasoning has disastrous results. Posted as 286 uIMHO low quality data is never better. Don't know what you mean by \"statistics-based approach\", but if you mean Machine Learning, bad data is disastrous to learn from. I disagree that data quality should be left to a \"do it yourself\" approach. Do you have specific objections to the DBO ontology? I have some, mostly related to redundant/non-orthogonal properties. But I don’t see serious defects in the class hierarchy. 
Data should not just be thrown away: the extractor should also make a list of exceptions, to be given back to the: - DBpedia Mapping community - Wikipedia editorial communities uOn 12/8/2014 12:18 PM, Vladimir Alexiev wrote: Sure, that's all the cases where the value of a property is not classified (as the corresponding page has no infobox or doesn't exist in DBpedia). As you might know, there already exist \"cleaning\" algorithms that post-process the extracted data [1]. But it will be of course useful to improve the extraction heuristics, and even to implement a new extractor that rigorously follows the ontology constraints - would be great if you contribute here, given your interest in the topic. Best, Volha [1] Downloads2014#mapping-based-types-heuristic uThanks for that excellent description page! Includes detailed descriptions and previews of almost each download file. Highly recommended! I've looked at Mapping-based Types (Heuristic), and indeed it includes useful \"stub resources\". E.g. from a football team roster, it infers FootballPlayers, with classes & name, and position (jersey number, team). E.g. from and the redlink \"Ryan Sappington\", it infers - stub person: - position: It could even infer nationality (USA) and team position (MF=midfielder) But my examples (and the posted issue When an *existing* resource violates a cooked object property range, it should not be extracted (And the second idea: data property ranges should provide hints to the literal extractor) Let's investigate dbo:parent on several dbpedias with most Persons. select ?person (count(*) as ?c) { ?x dbo:parent ?y. optional {?y a ?person filter(?person=dbo:Person)} } group by ?person | site | Person | NonPerson | Bad | | | uHi Vladimir, The cleaning heuristics should be described in the same paper as the type inference heuristics. Heiko Paulheim is the reference person. By the way, the cleaned version is the one loaded into the endpoint [1]. I will look into the country issue, this is not because the corresponding triples were filtered by the cleaning heuristic, but likely due to an older version of the \"mappingbased_properties_cleaned_en\" - not updated after we corrected some errors in the mapping wiki. But the \"uncleaned\" dump and the data in the endpoint should be ok. Volha [1] DatasetsLoaded2014" "Broken Turtle for David Bowie?" "uI turned my attention to somebody more people would care about (I think the only person who got a multiple page obituary in The Economist.) I do a request with jena, which gets the TTL file and gives the error Exception in thread \"main\" org.apache.jena.riot.RiotException: [line: 781, col: 29] Failed to find a prefix name or keyword: –(8211;0x2013) As indicated in the error message, the character in question is an en dash, which is unicode codepoint 0x2013. dbr:The_Deram_Anthology_1966–1968 dbp:artist dbr:David_Bowie ; where the dash appears between \"1966\" and \"1968\". 
I think this is a DBpedia problem because looking at the productions in the Turtle Spec [163s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF] [164s] PN_CHARS_U ::= PN_CHARS_BASE | '_' [166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040] I don't see the en dash in any of those ranges and I don't see it in the list of characters you can escape with a \ so I think you have to encode this in something other than prefix form. Could we get this fixed? uHi Paul, I think this is a serialization issue. The IRI is valid as an IRI but cannot be prefixed in turtle. if it was serialized as a full IRI it request would be valid. Can you try to get it in another format like nt? On Fri, Jan 27, 2017 at 12:50 AM, Paul Houle < > wrote: uI just tried and then deserialized this successfully with Jena. The difference in the data is more than just serialization, because all of the facts in the ntriples file have ?s=:David_Bowie whereas the ttl file has facts like ?someSong dbo:writer :David_Bowie . The file at on the other hand, looks more like a Turtle file because it has prefixes, unlike a real N-Triples file. The .json file has back links, but it is in a non-standard format. The .jsonld file does NOT have the back links. I fell back on the .xml which does have the backlinks, even if it makes me feel like I have become pond scum, destroyer of soap. uMaybe openlink can verify but all these linked data requests are transformed to sparql describe queries. And unfortunately, sparql describe does not have a normative definition. This indicates that the results cannot be deterministic. I think in VOS forward links are guaranteed but back links not (but not 100% sure) On Jan 27, 2017 20:19, \"Paul Houle\" < > wrote: u0€ *†H†÷  €0€10  `†He" "Several labels and abstracts for the same article?" "uHi list! Regarding this query: PREFIX owl: PREFIX dbo: PREFIX rdfs: SELECT DISTINCT ?uri ?name ?summary where { ?uri rdfs:label ?name ; rdfs:comment ?summary ; dbo:country . FILTER( lang(?name) = 'pt' && lang(?summary) = 'pt' ) } ORDER BY ?uri This should list 'things' in portuguese where country is Portugal. However I would expect each 'thing' or URI to have exactly one label and one short/long abstract per language. As this entry shows ( ) 'Universidade Lusófona' has several labels and abstracts with lang = 'pt'. When inspecting the dumps I can see that each entry for this resource URI has 1 label and 1 short and long abstract, as expected. Where are this extra labels / abstracts comming? Is this because of redirects? How could I modify the query above to return at most one entry for each URI? TIA for any clarifications!" "Dbo vs Dbp" "uHello I do not know when to use dbo and when to use dbp when we query using dbpedia. Kindly if some one can explain how to use these in which scenarios? Regards Hello I do not know when to use dbo and when to use dbp when we query using dbpedia. Kindly if some one can explain how to use these in which scenarios? Regards uHello Kumar, we differentiate between information extracted from the wikipedia dumps without an alignment to the DBpedia ontology (raw extraction) and the mapping based extraction (based on mappings between wikipedia infoboxes and the ontology). 
- extracted from the raw infobox extraction - ontology I also refer you to the description of the mapping-based properties dataset (containing the mapping based properties) : Mapping-based Properties *High-quality data extracted from Infoboxes using the mapping-based extraction. The predicates in this dataset are in the /ontology/ namespace.* Note that this data is of much higher quality than the Raw Infobox Properties in the /property/ namespace. For example, there are three different raw Wikipedia infobox properties for the birth date of a person. In the the /ontology/ namespace, they are all *mapped onto one relation* to unify these relations. I hope this is helpful, Markus On Wed, Feb 24, 2016 at 5:46 PM, kumar rohit < > wrote: uKumar: - if a property you need is in dbo: use that. But double that for the objects in interest, it is indeed present in most of them - if a property you need is not in dbo: or many of the objects of interest are missing it, use dbp: . As described by Markus, you may need to use several dbp: props, especially across different language DBpedias: each dbp: is named in the local language In addition to the merging of several dbp: into one dbo:, there is another advantage of dbo: - dbp: values of an array of subobjects are not correlated between each other. But if the mapping is properly modeled (e.g. using IntermediateNodeMapping), they can be properly correlated in dbo: That is the theory anyway :-) But dbp: come about automatically: each template property is emitted as a corresponding dbo: property In contrast, dbo: only come about, when someone has made an appropriate mapping. There are at least 2 potential problems with dbo: that a consumer needs to be aware of: - dbp:birthDate may be mapped to dbo:birthDate for 100 templates, but missing for the 101st template (or the mapping for that template may be missing altogether). How would a consumer detect this problem? - each dbo: is either an ObjectProperty or DatatypeProperty. This means that each returns only URLs, or only literals, but not both. dbp: doesn't have such restriction. Search for \"dichotomy\" in Cheers! Vladimir" "Questions about dbpedia data" "uI am just beginning to work with the dbpedia dataset and have the following questions a) Is the full wikipedia article text available somewhere for download. I found the short abstracts and the long abstracts but was looking for the full article text. If not any tips on how to get the article text from the wikipedia xml dumps would be appreciated. b) Is there someway to create a tree of Categories ? I am looking to create a category hierarchy and use it for data mining. Thanks for creating such a valuable resource. Regards, Deepankar" "Odp: Hello DBPedia!" "uHi, following the discussion regarding deficiencies of the DBpedia ontology, I would like to make you know that a team of collaborators sponsored by Structured Dynamics is working on classification of the Wikipedia articles into the Umbel [1] ontology. Yet another ontology, you may think, but Umbel is exceptional in the regard that it is entirely based on Cyc [2] - a more than 20-year effort of building an ontology large enough to capture entire common sense knowledge. We address many of the problems that are present both in Wikipedia and DBpedia, e.g. things like list and disambiguation pages, administrative categories in Wikipedia, compound categories in Wikipedia, etc. The good news is that exactly this week we've produced the first \"final\" result of the classification, i.e. 
around 90% of the Wikipedia articles has a Umbel type (we use ~4k different types in the classification) assigned via a mapping between Wikipedia categories and Umbel types. I am writing in quotes, since that result still has to be validated and curated and we are probably not going to publish it just at the moment. Still, I expect some consumable results to be produced in May, so stay tuned. What is more - regarding the problem of integrating the data from different data sources - I used Cyc, especially its conflict detection mechanism, in the past [3] to integrate classification results produced by various type assignment methods. For me it seems to be the best resource applicable in that problem, since besides the generalization relation, which is found in all the ontologies, most of the classes have disjointWith statements, even on a level of Cats and Dogs (i.e. individual species). As a result it is relatively easy to keep the data coherent. Kind regards, Aleksander [1] [2] [3] paper2.pdf uOn 4/10/14 5:40 AM, wrote: Yes, I am aware of UMBEL [1]. Great to see that its being brought up to date etcWhen will you have something published in RDF form? [1] 0071.html" "Announcing OpenLink Virtuoso, version 7.2.0!" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso, Version 7.2.0. The new release includes: * Loosely-coupled SSL/TLS SSL/TLS version and associated cipher selection is now configurable. Net effect, you can now explicitly disable the use of specific SSL/TLS versions (e.g., SSL/TLS 3.0, which is susceptible to the POODLE exploit). * Improved LDP Support Linked Data Platform protocol support has been enhanced, as an extension to existing WebDAV protocol support. Specific WebDAV folders can now be designated as LDP Containers, and once this designation has been made, LDP-aware user agents will be able to deductively interact with WebDAV Folders and the documents they contain, using entity relations embedded in both header and body of HTTP responses. * WebDAV access to 3rd Party Storage Services 3rd Party Storage Services (e.g., Dropbox, Amazon S3, Microsoft OneDrive, Google Drive, Box, Rackspace, etc.) are mountable and un-mountable using an asynchronous operation (in prior releases, this was a synchronous operation). Once mounted, these 3rd party folders function like any other WebDAV resource. This kind of WebDAV collection (a \"Dynamic Extension Resource Type\" or \"DET\" for short) can also be designated as an LDP Container, making Virtuoso WebDAV a powerful LDP proxy mechanism for storage services that do not currently support LDP. * File System Hosted Virtual Tables Enables Virtuoso instances to attach to CSV documents hosted by the File System. Once attached, VIEWs of these documents can be represented as SQL Relational Tables (by default) and/or RDF Property/Predicate graphs (via RDF Views functionality). * Loosely-Coupled Sponger Middleware Services Built-in RDF document transformation middleware is now loosely- coupled to its host Virtuoso database server instance. As a result, you can now use the POST method to request asynchronous data transformation services from a Sponger instance, with the transformed data returned as part of an HTTP response payload. 
* Built-in Nanotation Processor Enables automatic transformation of TURTLE statements embedded in Email Messages, Online Discussion Posts, Tweets, Social Media Posts (Facebook, LinkedIn, Google+), HTML Body, and Plain Text documents into RDF-based Linked Open Data, which is then stored in the host Virtuoso instance. You now have the freedom, power, and flexibility to create data wherever and whenever" "Data inconsistent and missing" "uHi, this is the first time I have used dbpedia and am running into some problems. I am trying to query a list of languages and am trying to get the following properties: Name Rank Foaf:Page Speakers Although all pages on Wikipedia have all of the properties I am looking for, dbpedia does not or the properties don't make sense. I am using the following sparql query: SELECT ?lang ?name ?rank ?page ?speakers WHERE { ?lang ?p . ?lang dbpedia2:name ?name . ?lang foaf:page ?page . ?lang dbpedia2:rank ?rank. } order by ?rank This finds most of what I am looking for but there a couple of inconsistencies: in particular the rank property of the modern English language is showing a url to another site. If I use the following query to try to get the number of speakers then the result is cut dramatically: SELECT ?lang ?name ?rank ?page ?speakers WHERE { ?lang ?p . ?lang dbpedia2:name ?name . ?lang foaf:page ?page . ?lang dbpedia2:rank ?rank. ?lang dbpedia2:speakers ?speakers. } order by ?rank In particular here, many languages are missing as there is no rank property yet the Wikipedia pages show the property. As I said I am new to this so am unsure whether this is to be expected or not. If it is then is there a way around this. Basically I need to query languages and get the properties mentioned above. Any help would be greatly appreciated." "„When it comes to data, 90% is worse than nothing“" "uHere's a blog post that is quite critical of DBpedia, and raises some interesting points: Quoting: uRichard Cyganiak wrote:" "Data quality improvements / SF bug-tracker" "uHi all, we want to improve the DBpedia data quality and need your feedback. What problems did occur to you when loading the datasets? What kind of wrong extractions did you see in the data. Are there any undetected units, currencies etc. Where did problems occur with list parsing? We already tried to extensively analyse the datasets, but also rely on the support of users and the community to establish a better extract-release cycle for the DBpedia datasets. Please submit all known problems to the DBpedia bugtracker at: Please include: * the dataset in which you found the problem * the triples concerned * a description of the problem and how it should look like (from your point of view) We will then try to come up with a solution, fix the extraction algorithms and publish improved datasets as soon as possible. Thanks Sören" "Clarification regarding the instance type files" "uHi, I am a software developer, we use DBpedia instance type or mapping-based type files in a pipeline to recognize entities. We found that the latest instance-types resource available at is much smaller than the corresponding 2014 release . As a result, the latest instance file is missing many entries present on Wikipedia such as Taj_Mahal, J._Paul_Getty_Museum, Grand_Canyon. What is the reason for the reduced size (110MB->35MB) Is this a bug? Are there some other files that we have to consider along with this file? We also sometimes see entries with '', as in \"Abraham_Lincoln1\" in the line < What does '' mean? 
Where can I find more information about these things. Thanks uHi Vihari, The main reason for the size reduction is due to the split between direct & transitive types [1] There was a bug [2] that indirectly affected some type assignments but is now fixed and the next release will not have this problem. Also note that besides SD-Types, in this release we published two additional type datasets, dbatx and LHD [3] Regarding your 2nd question (''). These resources are extracted from additional infoboxes in the same page but when they cannot be merged, we create additional resources. This is also a way to create intermediate node mappings through the mappings wiki e.g. in [4] [1] [2] [3] [4] On Mon, Dec 14, 2015 at 1:12 PM, Vihari Piratla < > wrote: uThanks Dimitris for a detailed response. I see 2,945,956 unique titles in instance-types_en.nt.bz2 and 2,716,774 unique titles in instance-types-transitive_en.nt.bz2. The number of unique titles in the two files together is 2,945,956. Currently, Wikipedia contains 5,031,836 articles in English. I am assuming the dump is missing 2 million or so titles because of the bug in the extraction framework. When can we expect the 2016 release? Thanks On Mon, Dec 14, 2015 at 8:53 PM, Dimitris Kontokostas < > wrote: uThe next release (2015-10) is underway. we will announce a beta release soon Regarding article coverage, we never had 100% coverage of all wikipedia articles 2014: 3.02M (typed main articles) out of 4.58M articles 2015-04: 2.95M (typed main articles) out of 5.03M articles so due to the bug we provide ~ the same number of types resources as the previous release but we should have provided (a lot?) more since the article number increased Best, Dimitris On Tue, Dec 15, 2015 at 12:27 PM, Vihari Piratla < > wrote: uTwo other sources you might consider are Freebase and Wikidata. Using them together with DBpedia might give you better results. Tom On Tue, Dec 15, 2015 at 5:27 AM, Vihari Piratla < > wrote: uThanks everyone for your suggestions and clarifications. On Tue, Dec 15, 2015 at 10:18 PM, Tom Morris < > wrote: uNot necessarily. There are very many articles without an infobox, and they don't get a \"SD type\" (as Dimitris called it). They may get a heuristic type, dbtax type or LHD type. Can you find some examples of articles with infobox and without type?" "Semantic Bookmarking Service: Faviki" "uHi all, just found a new semantic bookmarking service called Faviki [1] which uses DBpedia for tagging content. Great stuff :) But I'm wondering if these guy can show some more value from using \"semantic\" tags. And I hope they will start publishing their data as Linked Data Cheers, Georgi [1] uHi Georgi Thanks for the pointer. Faviki is indeed great. I'd never be convinced by any social bookmarking so far, because I found tags so messy, but I've adopted Faviki right away ( adoption curve seems to be steep since yesterday. One interesting potential feature is the feedback towards Wikipedia editors themselves. They can tap in the resources indexed on their favourite articles to improve the article content. So this is a good example to monitor of social semantic feedback. Bernard Georgi Kobilarov a écrit : uHi Bernard, Faviki is a service that I have waited for a long time. I used del.icio.us in the past, but its social component was mostly irrelevant to me. I want to use a bookmarking service for my own personal benefit of storing and remembering websites. 
This social stuff is great if it makes it easier for me to reach my goal, and the delicious tag recommendation is a good example. People tag their stuff, and by doing so they help other users. But I would never contribute to delicious *just* to help other users (and to train the recommendation algorithm). Here's a fantastic blog post about that relation of personal benefit and social systems: Faviki has great potential, but IMHO they have to find a way to use DBpedia's semantic graph to provide additional *personal* value to their users. The potential feature of contributing to Wikipedia you've mentioned is indeed very interesting, but the system has to be built in a way that these social contributions happen as a side effect of people using the system because they love it for the personal problems it solves. Cheers, Georgi uOk, I looked a bit closer at Faviki, and the way they use the categorization and classification data is brilliant. Still wondering whether/how this will actually help me in daily use, but I'm very excited!!! uHi all, Faviki provides RSS 1.0 for a bookmark list, and tags to each item are included as DBpedia URIs, e.g. Notice that, besides somewhat obsolete taxo:topic + rdf:Bag/rdf:li model, DBpedia URIs are 'data' rather than 'resource'. What do you think, as tag object, is good URI in this case ? I guess would be straightforward, but any rationale to use 'data' URIs here ? cheers, uThanks for the pointer to their RSS! The use of dbpedia.org/data/ is clearly wrong! Must be dbpedia.org/resource/. I don't have an opinion about dc:subject compared to taxo:topics + Bags Cheers, Georgi uHi Georgi Georgi Kobilarov a écrit : In this case, why not simply using your browser bookmarks? If I think only of personal benefit, the one I see in social bookmarking is discovering more relevant resources around the ones I know of. Sharing is a win-win strategy, I think all the history of web technologies is there to support this thesis. Of course. This is not a question of technology, it's the basis of sociality. Why do we exchange on this forum or other ones? Well, it's amazing that people need to look at social bookmarking to re-discover the basic principles of social life. They are not different for the network. Hanging my favorites on DBpedia concepts is indeed adding personal values. The auto-completing tagging made me discover already Wikipedia stuff I did not know. Not to mention other bookmarks that other will tag with the same. And navigation to related topics and categories makes me discover more. The personal benefit is obvious for me. Well this is a strange assertion to me. Why do people contribute to Wikipedia? What is the personal reward? Don't forget that DBpedia, and hence the possibility of a tool like Faviki, and of DBpedia being the backbone of the Linked Data cloud, is just a \"side effect\" of the huge work of the Wikipedia community. Granted DBpedia formalize the semantics, but if the semantics were not already implicitly embedded in Wikipedia pages (and first of all, its subject-centic nature), there would not be anything to build upon. So why would this very community not use the social semantic loop, the added value of resources bookmarked on DBpedia concepts, to augment the Wikipedia content itself, and enter in a virtuous semantic circle? I think we have to see those tools in the Big Picture of collective intelligence emergence. The social success of Wikipedia has proven that this is not a void concept. Cheers Bernard uHi DBpedia Folks! 
Please sorry for my delayed responses. I'm very glad that folks from DBpedia like (and use) Faviki! I think DBpedia is a great project, and hope it will continue to develop. Version 3.0 has some great improvements (especially disambiguation and redirect extractions, and I hope they will be perfected in the future). Abstracts are pretty important, and I think they should be better. I believe mess appears because of removing urls from text (besides Wikipedia inconsistency)? Did you consider another alternative abstract with urls left inside? I noticed links (to other Wikipedia articles) in the first couple of paragraphs of Wikipedia article are usually pretty related to article itself. If abstract in Faviki have links (to Faviki pages about tags), or an option to show only that related tags, I believe it would be another great way to navigate and learn about stuff. I agree that general problem with social and collaborative systems is that they must provide a benefit for the user, when there are not much data. It seems that good ideas often fail because they're highly dependent on large number of users and their activity. In my opinion, the challenge is to make a system which will be useful if there is only one user, and let the real benefit (coming from many nodes and interconnections) comes later. That's what I'm trying to do with Faviki. :) In my vision Faviki will evolve toward connecting the people with tags and websites and defining types of webpages (using predefined predicates), that was actually the original idea of Faviki, but I decided to start with smaller and more focused version in order to provide easier adoption and understanding of its key values I'm glad we have a similar thoughts about connecting it with Wikipedia. I really believe there is a potential, and that Wikipedia can benefit, too. I fixed RSS- changed DBpedia URI from 'data' to 'resource' It would be nice to keep in touchI would like to hear suggestions and ideas from you, from the perspective of DBpedia creators and Faviki users. Best wishes from sunny Belgrade, Vuk Milicic uHi Vuk, 2008/5/28, Vuk Milicic < >: Very nice to see such a prompt action. Thanks ! May I suggest a few more points on Faviki RSS ? (1) Item URI: current RSS uses bookmarked document's URI for item's rdf:about. This would cause several problems: - dc:creator means its object is a creator of the subject (item). Since this RSS assigns a user to dc:creator's object, this asserts that one who bookmarked had created the document, which is not true in most cases. - once aggregated, items with the same URI will be merged, and then a document (denoted by the URI) will have multiple dc:creator, dc:date etc. For example, see RSS for one document (e.g. I guess the items here would be bookmarks rather than target documents. So why not assign a bookmark URI (e.g. use to provide the target document URI so that RSS readers properly show hyperlinks. (2) rdf:Bag/rdf:li : the RDF container model has some problems (e.g. for SPARQL query) and is generally considered obsolete. Would you use direct multiple properties for multiple tags, e.g.: This is not a major suggestion, but certainly welcome improvement to already great RDF data. Thank you. uHi Kanzaki, Thank you very much for your suggestions! I changed item URIs and rdf:bag part. Can you please check it out (e.g. I am not sure what should I do in the case when bookmarks are sorted by popularity (e.g. blank. I'm not sure if this is ok. 
You can check it on Thanks, Vuk On Thu, May 29, 2008 at 3:51 AM, KANZAKI Masahide < > wrote: uHi Vuk, Thank you very much for RSS updates! Looks marvellous. Because it's getting so nice, I can't help asking to fine tune the model a little bit more. May I ? (1) dc:subject Now the items denote bookmarks, not target documents, it seems somewhat strange that an item (a book mark) has dc:subjects (because they are the subjects of the target document). If you don't mind adding one more namespace declaration, it would be better to use Tag ontology [1], and replace dc:subject with t:associatedTag, e.g.: xmlns:t=\" (document_uri) etc. (It would be still better if you include to generate a triple to connect the tag=item and the document, but I'm afraid this might be too much asking you ?) (2) value of dc:creator It's good idea to have user (tagger) URI as value of dc:creator. It would be much better if you use instead of literal value ( http). (3) non declared XML entities For some reasons, this RSS includes several XHTML entity references such as â, ¦, ã, etc. These are declared in XHTML DTD, but not predefined XML entities. This makes the RSS ill-formed, and would cause fatal errors in XML applications. (This is not an RDF issue, but RSS/XML in general). I'd suggest use numerical reference such as ģ (or %HH encoding for URI) if necessary. OK, in this case, the items are the target documents, rather than tags assigned by Faviki users. So, you can safely use target document URIs for rdf:about attributes on each item (and rdf:li in items). You can still use thing as document uri does. The values of dc:creator and dc:date for this item should be those of the target document, which are usually not known in bookmarking system. Hence, you can omit these properties (better than blank properties). In turn, dc:subject is fine in this case. If you are interested in Tag ontology, you might want to use t:taggedWithTag in place of dc:subject (this would look somewhat tricky, but works). I appreciate very much your effort to make Faviki RDF output better. thank you and best regards, [1] uHi Kanzaki, Thank you so much for your time and advice. Please! I'm not the semweb expert, so every advice is welcomed :) I did it, but have two concerns about this: - Will RSS readers/aggregators (e.g. friendfeed) will still be able to read RSS easily? - I am considering using MOAT ontology, too. As far as I understand it, MOAT extends Richard Newman's tag ontology and provide a way to deal with tags with different meanings. I'm not sure if this is needed for Faviki, because ambiguous tags are automaticly redirected. Did you mean ? Should I leave tag or remove it completely? Can I remove some of the namespaces that are not used, like ' Best regrads, Vuk On 6/1/08, KANZAKI Masahide < > wrote: uVuk Milicic wrote: [snip] Please use the MOAT and SCOT ontologies in conjunction with SIOC. To get a feel for the end product of such an endeavor see these views of my Tag Clouds via the OpenLink RDF Browser: 1. My Blog Data Space Tag Cloud 2. My Shared Bookmarks Tag Cloud 3. My Feeds Subscription Tag Cloud 4. A Collection of Tag Clouds (1-3) The Tags are bound to SKOS, MOAT, and SCOT, with SIOC used for Containment (i.e Data Space partitioning). All the Tags have URIs and these Tag URIs are associated with Meaning URIs (courtesy of MOAT). SCOT is used for the Tag stats, and SKOS for Conceptualization and Preferred Labeling. 
Of course, my tags are a mesh of del.icio.us and Technorati tags based on the bonding that exists between by ODS instance and these Tag oriented Data Spaces (i.e. Technorati and Del.icio.us). The following URIs will expose the Tag Clouds above in any RDF aware user agent (e.g. Zitgist Data Viewer , Tabulator, DISCO, Marble, others): 1. My Blog Data Space Tag Cloud 2. My Shared Bookmarks Tag Cloud 3. My Feeds Subscription Tag Cloud uHi Vuk, thank you for your prompt action. Yes, as far as I know, RSS readers respect vocabularies they know (e.g. RSS 1.0, DC etc.), and just ignore those they don't. Actually, Bloglines and Google reader can aggregate the revised Faviki RSS without any problem. Ah, well, t:associatedTag is usually used to relate a context dependent Tag (e.g. tags in del.icio.us, whose meaning might be different for each user). MOAT could be used to give a global meaning to such Tag. Here, however, Faviki uses DBpedia resources as the tag, whose semantics are already defined. I guess you don't need to further define the meaning of them with MOAT. (It could be debatable whether t:associatedTag is really appropriate here, though) Yes, sorry (I had Tag ontology in mind). Leave it. RSS readers rely on tag to find bookmarked document's URI. Although looks redundant, each serves different purpose. I recommend to use both taggedResource and link. Sure. If you do not use terms from that namespace, you can safely remove such xmlns: declaration. best, uHi Kingsley, Kanzaki Thank you very much for help. I must admit I am little confused by the contradictory advices I get from the semweb community. I am learning about using MOAT, SCOT, SIOC and other ontologies and hope I will be able to implement them soon. Best regards, Vuk Milicic On Mon, Jun 2, 2008 at 2:59 PM, Kingsley Idehen < > wrote: uVuk Milicic wrote: Vuk, Semantic Web community is large! I am specifically of the Linked Data tribe within the Semantic Web nation :-) What are you finding to be contradictory? Kingsley uFor example, regarding the question if there is need for MOAT or not, because Faviki uses only semantic tags, and should it use associatedTag to denote the DBpedia resource, and what ontologies sould be used I was a bit frustrated because it seems there are many things that are debatable, and many ways to implement the same thing. It would be much easier to have one accepted way of doing things, at least from the perspective of someone who is still learning about it:) However, I learned more about the stuff (it took some time), and things are much clearer now Thanks and sorry for my delayed response, Vuk On Fri, Jun 6, 2008 at 6:21 PM, Kingsley Idehen < > wrote:" "distributed-extraction-framework and distributed extraction in general" "uHello, If I understand well the distributed-extraction-framework is meant to run the extraction in Spark. However it does not directly depend on extraction-framework and the code is a bit old (the last commit is August last year). Is it something planned for the future? (i.e. running the extraction in a distributed manner). If yes, when roughly distributed-extraction-framework will be updated with the newest code from upstream? Thanks a lot!! Regards," "DBpedia: limit of triples" "uOn 9. Aug. 
2011, at 12:26, Jörn Hees wrote: sorry, just noticed that .ntriples seems to be an exception here (gave the link for better readability) in contrast to the default xml-rdf: n3 and json also show the problem: Jörn uHi, see the \"Limitations on browseable data\" section on That section pretty much covers your 2nd remark, the one afterwards your 1st one. But yes, i also consider DBpedia buggy in this sense (hence the crossposting): I guess that many newcomers won't even notice that they might not have gotten any sensible triples at all. It seems that internally the URI dereferencing results are capped at a 2001 triples max. At the same time the incoming (/ reverse) links are given first, then the outgoing links. For example, if you dereference I also guess it would be better to construct the given document first from the outgoing triples, maybe preferring the ontology mapped triples, and then incoming links up to a 2000 triples limit (if necessary to limit bandwidth). That would fit the description in the above mentioned section way better than the current implementation. Best regards, Jörn On 8. Aug. 2011, at 17:15, Basil Ell wrote: uJust a small note. I think you mean that the SPARQL engine behind a particular deployment of DBpedia is behaving differently from what you would desire. Although there are bugs in DBpedia, this is not one of them. :) I think it is important to make this distinction between DBpedia and the SPARQL endpoints serving its contents exactly to point out that you could provide your own implementation/wrapper that sorts/limits results the way you want. With more deployments, everybody benefits from the collaboration and the competition. On the one hand maybe the traffic will be more evenly spread, and on the other hand one improvement by a player may push other players to keep pace. I would be curious to hear about advantages provided by one endpoint over another, for example. Has anybody done that? Cheers Pablo On Aug 9, 2011 12:30 PM, \"Jörn Hees\" < > wrote: >'yes, i also consider DBpedia buggy in this sense (hence the crossposting)' Just a small note. I think you mean that the SPARQL engine behind a particular deployment of DBpedia is behaving differently from what you would desire. Although there are bugs in DBpedia, this is not one of them. :) I think it is important to make this distinction between DBpedia and the SPARQL endpoints serving its contents exactly to point out that you could provide your own implementation/wrapper that sorts/limits results the way you want. With more deployments, everybody benefits from the collaboration and the competition. On the one hand maybe the traffic will be more evenly spread, and on the other hand one improvement by a player may push other players to keep pace. I would be curious to hear about advantages provided by one endpoint over another, for example. Has anybody done that? Cheers Pablo On Aug 9, 2011 12:30 PM, 'Jörn Hees' < > wrote: uOn 9. Aug. 2011, at 13:15, Pablo Mendes wrote: Yes, this was imprecise. I was not talking about the SPARQL endpoint (which in fact is able to return more than 2001 triples per subject). I was talking about the standard thing that many people do with a http URI: dereference it. I agree that other / local SPARQL endpoints are useful for mass queries and to take load of the DBpedia servers, but i don't see how they help in my case, as dereferencing still goes to the server(s) at dbpedia.org. 
Cheers, Jörn uHi The [SPARQL] ResultSetMaxRows = 2000 MaxQueryExecutionTime = 120 MaxQueryCostEstimationTime = 1500 These are in place to make sure that everyone has a equal chance to de-reference data from dbpedia.org, as well as to guard against badly written queries/robots. The following options are at your disposal to get round these limitations: 1. Use the LIMIT and OFFSET keywords You can tell a SPARQL query to return a partial result set and how many records to skip e.g.: select ?s where { ?s a ?o } LIMIT 1000 OFFSET 2000 2. Setup a dbpedia database in your own network The dbpedia project provides full datasets, so you can setup your own installation on a sufficiently powerful box using Virtuoso Open Source Edition. 3. Setup a preconfigured installation of Virtuoso + database using Amazon EC2 (not free) See: Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 9 Aug 2011, at 13:04, Jörn Hees wrote: uURI: dereference it. Oh, sure, my bad. Made a leap there because of my assumption that the triples are being generated by a SPARQL DESCRIBE. But anyways, I'm happy to hear that linked data is already the standard thing to do with a URI. :) Not too long ago it was just a longer and weirder ID as compared to auto-increment PKs. But even for dbpedia.org/resource, my suggestion of multiple deployments may apply. One could think of load balancing between a few key providers. I'd be excited to see that happening and to observe its implications. Cheers, Pablo On Tue, Aug 9, 2011 at 2:04 PM, Jörn Hees < > wrote:" "'2007-01-01T00:00:00+01:00' is not a valid value for datatype XMLSchema#gYear" "uHi all, try this simple query against the DBpedia SPARQL endpoint: SELECT * WHERE { ?y } You will get: \"2007-01-01T00:00:00-04:00\"^^ However, as you can check here ( xsd:gYear data-type. It should be just 2007. Similarly, if you check this RDF data ( This causes RDF reasoners to raise an error. Curiously, if you go to the HTML version of the same resource ( data-type between parentheses: 2007-01-01 00:00:00 (xsd:date). Is it possible to correct this error before the new DBpedia release? If you want, I can post the issue again on the DBpedia-developers mailing list. Cheers, roberto uI've noticed there's also an open ticket for such issue: cheers, roberto Il 31/05/2012 11:35, Roberto Mirizzi ha scritto: uHi Roberto, thanks for the report. These problems should be fixed. You can always check what the latest code does with the DBpedia extraction server: For example, this page contains the triple \"2007\"^^ . Cheers, JC On Thu, May 31, 2012 at 11:35 AM, Roberto Mirizzi < > wrote:" "procedural suggestions for mapping wiki" "uHi everyone! Being a newcomer to the dbpedia mapping wiki, I'm discovering the power of the idea. But one thing I'm missing from other wikis, is DISCUSSION about editorial decisions. Notably on wikidata: - there's a lot of discussion before a property is introduced, e.g. see - there's a lot of rich metadta, including discussion but also validation rules etc: Eg see DNB identifier: Having such discussions on this mailing list is suboptimal, to say the least. So I have the following procedural suggestion: - make a project - for every bug or suggestion, use the appropriate property's Discussion page to decide what to do, and to document it. E.g. 
for gender: I also miss some editorial templates that would be very useful for these discussions, e.g.: {{support}}, {{oppose}}, {{question}}, {{comment}}, {{ping}} (But {{comment}} is taken for rdfs:comment) Note: I've only been on wikidata for a week but already found them very useful. I'm sure that others with more editorial experience can suggest more (or perhaps a whole package that can be \"imported\"). Cheers! Vladimir uOn Mon, Dec 22, 2014 at 6:23 PM, Vladimir Alexiev < > wrote: Hi Vladimir, This is exactly what we plan to do :) We plan to port the DBpedia ontology in WebProtege and track ontology issues/changes on github. The German chapter is working on this and we'll announce it when it's ready. The mappings wiki will be used only for defining template mappings, the ontology in the wiki will be in read mode and a bot will keep it in sync with WebProtege. The problem here is that the Wikidata community is much bigger than the DBpedia mapping community and most of then have a \"Wikipedia editor\" background while our's probably does not. We welcome all suggestions but we also need someone to lead this effort. If you are up to this task you have a go from me and I am sure from many others. We can facilitate this by giving you, or any other who wants to join additional permissions if needed. Cheers, Dimitris uVladimir> So I have the following procedural suggestion: Dimitris> This is exactly what we plan to do :) Ok. Let's track the ontology & mapping issues in the same project on github. Could you please make that project (e.g. I don't know the second thing about MediaWiki. I've only been using it a year (ashamed to admit it). But we can collaborate even without such goodies. As soon as you make the tracking project, I'll write a simple procedure on the front page. Cheers! and thanks for your efforts!!" "DBPedia freshness" "uCan someone let me know which Wikipedia data dump file it is that is the input to DBPedia? On the Wikipedia SQL-Dump\". Is it this one we talk about? Or is it another file that is being used as input into the DBPedia system? Also, I see that the latest dump of DBPedia is 8 months old (from October 2008). Is there anything preventing DBPedia to create a fresher dump from the data at I'm curious to know if the reason the data is not fresh is an issue with that someone actually has to manually download the Wikipedia data and run the scripts (and it has just not been done yet), or if the issue is technical somehow and that it has failed with newer data? Thanks /Omid uHi Omid, there are several Wikipedia dump files we are importing in order to extract the data for DBpedia (see the importwiki.php in the DBpedia SVN). It is true that DBpedia is quite out of date at the moment. There has been a lack of Wikipedia dumps during winter and spring, but Wikipedia recently started to publish dumps much more frequently. We are currently in the process of preparing DBpedia 3.3, based on a late May dump of the English Wikipedia (and dumps of other languages around that time). I can only roughly estimate when DBpedia 3.3 will be available, but keep an eye on the DBpedia mailinglist around end of next week Cheers, Georgi uThanks Georgi, I have also noted that Wikipedia significantly has increased the frequency with which they are releasing their dumps. I remember there was a period from October 2008 to early this year when no new dumps were completed for 5-6 months time. 
The question is, how much manual work and how long processing time is there for DBPedia to release a new dump once a new Wikipedia dump is released. Assume that Wikipedia would start releasing complete data dumps on a daily basis, would DBPedia theorietically be able to release dumps also on a daily basis? Or is the processing itself require for example one week of processing making impossible to have DBPedia daily fresh even if Wikipedia would have their data dumps daily fresh. Basically I try to figure out what the minimum delay would be from a new Wikipedia dump is released to that a new DBPedia is released is with the current DBPedia scripts. Also, if the process currently involves many manual steps (to download Wikipedia dump, process the data etc.), is it something that could very easily be automated so that keeping DBPedia fresh would not involve any human intervention? Thanks /Omid On Thu, Jun 18, 2009 at 12:20 PM, Georgi Kobilarov< > wrote: uOh, I thought the Wikimedia Foundation would have offered DBpedia a free live feed by now. On Thu, Jun 18, 2009 at 1:20 PM, Georgi Kobilarov < >wrote: uHi Omid, it is true (as Brian wrote) that the Wikimedia Foundation has offered us their Wikipedia live feed. And we are in the process of developing and deploying a real-time update version of DBpedia. Jens also wrote about that recently on the mailing list. The reason that we need some time for preparing DBpedia 3.3 is that there have been a bunch of changes to the DBpedia extraction framework, some experiments with the code base, and we simply need to get things together again. Usually, it takes around 1 day to process a whole Wikipedia dump, including importing it into MySql and running the extraction. But due to the bunch of code changes, we were facing some bugs and extraction errors. Nothing to worry about, but it requires time to get up to speed again. And since DBpedia still is kind of a spare time project for all participants among our other research projects, we don't always find the time to work on it. In the long-term, DBpedia will probably only rely on the Wikipedia update feed instead of the Wikipedia dump files. There will be daily or weekly diffs, monthly full dumps and hopefully a DBpedia live feed as well. I hope that sounds reasonable. If not, please let us now. Cheers, Georgi uIt is great that DBPedia will get access to a live feed and can generate the data dumps much more frequently to keep it fresh. Daily or weekly freshness sounds reasonable to me. Is this feed something DBPedia specific you will get access to or is it something everyone can access? Is this feed the same thing as these IRC feeds? On Thu, Jun 18, 2009 at 2:47 PM, Georgi Kobilarov< > wrote: uwe are using a password protected Wikipedia feed, which serves all changed articles with their full markup text in real-time. The difference to the freely available IRC channel is, that the IRC channel only serves the URIs of changed articles, but not their text. So one would have to get the article text from Wikipedia via this API: And Wikipedia does block IPs which hammer too much on that API. And the live feed also includes a nice protocol and mediawiki plugin support for setting up a local up-to-date mirror of Wikipedia. Cheers, Georgi offered things as" "Multilingual labels in dbpedia" "uHi all I have two questions regarding multilingual labels in dbpedia. First question is political, sort of : the choice of language tags. I understand it is based on the list of wikipedias ordered by number of articles. 
This is an arbitrary but defendable choice in dbpedia framework. But in this case, why does Chinese (zh) appear but not Russian (respectively n°12 and n°11 at Another choice to consider would be the number of speakers, as defined either at which provides a quite different order indeed! No conclusion at the moment. Just to mention it \"en passant\" :-) Second question is more technical. All labels with non-ASCII characters seem to suffer some encoding issues. In the html rendering all such labels end with a \" character, for example some labels for * Frans (nl) * Franska (sv) * Französische Sprache\" (de) * Français\" (fr) Using SPARQL query what I get for the latter label is as following Fran\u00E7ais\\" (in the html answer) Français' (in the xml answer) Given that all Wikipedi pages are correctly encoded in UTF-8, if I judge by what I got in my extraction for lingvoj.org there is a bug somewhere in the label processing from dbpedia side. Bernard" "ordering of multiple values for an object in Virtuoso" "uHi Tim, Some properties may be multi-valued, like founders of a company. And the ordering of these founders are significant, so my question is if Virtuoso can remember values' insert order and keep their query output in the same order as insertion. E.g., The following three N-triples are inserted one by one (A is the most significant founder, C is the least significant founder). \"A\" \"B\" \"C\" Now, I want to query the graph to know who are the founders of company1. I expect the output would be A, B and C (in the same order as they were inserted). By default, I knew the store don't care values' order. Do you know how to make it happen by some other means? Object create timestamp is a possible solution I can come up with right now.But how to get this info in Virtuoso? Thank you in advance. Any tips on that would be greatly appreciated!" "opencyc link accuracy" "uhi dbpedia, i've been really excited with the opencyc links, and am very impressed with what must be a very difficult reconciliation project. The newer links though maybe, are less accurate? has the algorithm changed since the summer? I did a manual check of a (admittedly small) sample and found a 10% error rate- http://dbpedia.org/page/Coil http://dbpedia.org/page/Charge-coupled_device I'd like to get involved in this project, would it be helpful if I made a larger list of found (or suspected) errors? I'm not clear on how manually-editable either dbpedia or opencyc is. cheers ;) uSpencer Kelly wrote:" "I18N Redirects / was: AW: DBpedia 3.7 released, including 15 localized Editions" "uCongratulations to the internationalized release. It offers exciting opportunities for linking to third language vocabularies. For this purpose redirect files (such as redirects_en.nt) are extremly useful, since they offer lots of possible synonym names for lookup. However, neither on the download web page nor in the download server directory, I found such redirect files for e.g., German. Do you consider providing such files? Cheers, Joachim" "How i can extract" "uHi, I would like to know how I can extract the data on Wikipedia FR which correspond at this infobox : Because each time that I try to operate the tool of extraction, it doesn’t work. Thanks in advance. Julien." "Writing parser rules for Generic Extractor" "uHello, I'd like to improve the parser for the company infoboxes. 
In particular, the extraction for dbpedia-owl:keyPerson on the BMW article is flaky - I'd like to change the parser to ignore anything between tags, which should improve accuracy. If I interpret the .xls correctly, then key_people is derived from Organisation and the parsing rule for organisation is used. I assume I have to change rules.xsl to specify a new class for keyPerson and then add code for my new rule somewhere in extractors/infobox/. Can someone give me an example or some documentation on how I would go about this? Or have I missed anything in the existing documentation? Regards, Michael Haas uHi Michael, the mapping.xls is for mapping infoboxes to ontology classes and infobox properties to ontology properties. The rules.xls keeps track of exceptions to the definitions in the mapping.xls. Both files are documented in the SVN ontology/docs files. If you make any changes to the xls files you need to rerun the database import scripts in the SVN ontology directory. As defined in the mapping file (.xls), keyPerson on Organisation has the range Person. If you want to change the MappingBasedExtractor (former Generic Extractor) to ignore tags, you might want to use the Util class and its functions. For the keyPerson case, find the corresponding code for object properties in the MappingBasedExtractor ( \"switch($property_type) {\" ) and define your own replacements. The Util class offers a function called removeHtmlTags which might be useful. Hope this helps, Anja Michael Haas wrote: uHello, Michael Haas wrote: A small note in addition to what Anja said: We are currently working on a DBpedia live infobox extractor, which will use a different mechanism than the current mapping based extractor. However, we will take your changes into account if possible. Kind regards, Jens uJens Lehmann wrote: Thanks for letting me know. Are you going to use the OAI harvester which requires a password? I'm currently using the DBPedia framework to extract the Company Infoboxes using the LiveWikipedia collection. What new approach are you going to be using? I assume it'll be different from just feeding a LiveWikipedia collection into the MappingBasedExtractor. As you can see, I'm slightly confused here as most of my information comes from Georgi Kobilarov's blog. I seem to remember him saying there will be a new interface to specify mappings for the current extractor. Where is DBpedia going these days? Going through the SVN commit logs is not exactly helpful for an outsider as many commits simply do not have a commit message. Regards, Michael uHello, Michael Haas wrote: [] Yes. Yes, it will be different. We will let you know more once our ideas (and their implementation) are somewhat stable. :) Yes, but very recently we switched to the idea of maintaining the mappings directly within Wikipedia and found a way to (probably) achieve this. So Wikipedia itself will be the interface through which mappings can be specified. I agree that more log messages would definitely be a good thing and I just sent a reminder to the developers currently using the DBpedia SVN. Kind regards, Jens" "Dbpedia from Protege" "uHello How can I use DBpedia resources and properties with my Protege ontology? I can run DBpedia queries using Virtuoso but do not know how I can use it with Protege. I could not find even a tutorial discussing things like this, so if someone can provide a link to a tutorial, I would be grateful. Regards
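(The reply that follows suggests using CONSTRUCT queries against the endpoint and importing the result into Protege. A minimal hedged sketch of that approach; the resource and the property filter are only illustrative, and the result can be saved as RDF and then opened in Protege:

CONSTRUCT { <http://dbpedia.org/resource/Berlin> ?p ?o }
WHERE {
  <http://dbpedia.org/resource/Berlin> ?p ?o .
  FILTER(STRSTARTS(STR(?p), "http://dbpedia.org/ontology/"))
}
LIMIT 500

The filter keeps only ontology-namespace properties so the exported subgraph stays small enough to browse comfortably.)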
uYou can import the DBpedia ontology in Protege (cf. ) and build your own ontology upon it. And/or you can query the DBpedia endpoint for data, using CONSTRUCT queries, and import the RDF query results in Protege. I have hacked a visual tool to interact with SPARQL endpoints. Cf. (note: you can export the query results as RDF with this tool, and then import them in Protege). Sent from my iPad On 22 March 2016 at 20:43, kumar rohit < > wrote:" "Old DBpedia dumps and names across versions" "uHi, my name is Dario Garcia and within the context of my PhD research in AI I'm analyzing how pagelinks evolve in DBpedia. When trying to download the oldest versions I found a couple of issues. The information given for the three oldest versions available of DBpedia is: DBpedia 3.1 Triples: 68.5M; Filesize(download): 454.4MB; Filesize(unpacked): 8.9GB DBpedia 3.0 Triples: 59.9M; Filesize(download): 394.8MB; Filesize(unpacked): 7.8GB DBpedia 3.0RC Triples: 69M; Filesize(download): 401MB; Filesize(unpacked): 9GB The first issue is with the oldest DBpedia, 2.0. The link to the pagelink file does not work and so I wonder if the file still exists, and if so from where it can be downloaded. The second issue is with the number of triples. How come an older version (3.0RC) has more triples than later versions (3.0 and 3.1)? What changed in the middle? One last question, regarding all versions. Are names preserved throughout all versions? Are there pages whose names change over time so that they are called differently in two different DBpedia dumps? I guess that may have happened in Wikipedia, and wondered if you were aware of it. That's all. Thank you for your time. Dario. uOn 30 April 2014 20:26, Dario Garcia Gasulla < > wrote: Probably here: pagelinks.tar 02-Apr-2009 20:32 368M I don't know. Names of Wikipedia pages sometimes change. When that happens, the IRI of the corresponding DBpedia resource also changes. That's a bummer. Since release 3.5 (April 2010), there's a workaround: DBpedia now also extracts the Wikipedia page ID [1], which does not change when a page is renamed. RDF triples for the page IDs are published in the \"page_ids\" files. To find out which DBpedia resource names have changed from one version to the next, you could write scripts that analyze these files: look for lines that have the same page ID in both files but different URIs. Maybe someone has already done that, I don't know. Could be done with a few lines in bash. Regards, JC [1]" "sameAs links in DBpedia 3.7 dump" "uHi there, What happened to the interlanguage-links file of the DBpedia 3.7 dump? There is a \" interlanguage_links_en.n3.bz2 \" file that contains only links to 'el' and 'de' chapters. I am running a local image of DBpedia en with the 3.7 dump and need to add sameAs links. I tried with the DBpedia 3.8 interlanguage links file but the encoding seems to be different, at least for the parenthesis. for DBpedia 3.7 for DBpedia 3.8 Is that the only change in encoding? Cheers, Julien
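(For reference, and not part of the original exchange: the interlanguage links end up as owl:sameAs triples, so a loaded local copy can be spot-checked with something like the following. The example resource is only illustrative, and the namespaces assume the usual el./de. chapter URI patterns:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?other
WHERE {
  <http://dbpedia.org/resource/Berlin> owl:sameAs ?other .
  FILTER(STRSTARTS(STR(?other), "http://el.dbpedia.org/resource/")
      || STRSTARTS(STR(?other), "http://de.dbpedia.org/resource/"))
}

If the file was loaded correctly, this returns the Greek and German counterparts of the resource.)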
uHi Julien, On 01/21/2013 02:47 PM, Julien Cojan wrote: You can find more information about the encoding differences between DBpedia 3.7 and DBpedia 3.8 here [1] [1] URIencoding uThanks Mohamed, Julien" "Distance in DBpedia" "uHi everyone, I'm trying to find a way to calculate the distance between two resources within dbpedia. I thought I could make a series of joins in sparql like this SELECT ?1 ?2 ?3 WHERE { { ?p1 ?1.} UNION {?1 ?p2 .} {?1 ?p3 ?2.} UNION {?2 ?p4 ?1.} {?3 ?p5 ?2.} UNION {?2 ?p6 ?3.} { ?p7 ?3.} UNION {?3 ?p8 .} } until I found at least one way between the resources, but as I thought, this kind of approach is really too heavy to compute and starting from 3 steps or more it will result in a timeout from the server executing the sparql query. So I was reading this article: Discovering Unknown Connections – the DBpedia Relationship Finder where two algorithms are described: the first one is a clustering one which divides the whole set of RDF triples into clusters of connected sub-graphs and assigns a distance value from a single random resource to all the others. The other one does more or less what I'm trying to do, calculating the ways that connect two nodes (also calculating the minimum distance and the maximum distance as the absolute value of the difference and the sum of the relative distances to the central resource), but there's this instruction: \"formulate SQL query for obtaining at most (n − m) connections between O1 and O2 of length d without objects and properties in the ignore list;\" so it is similar to my original idea. I really would have liked to see how it is implemented in an efficient way, so I downloaded the code but I was unable to run it because of a db problem (it requires a statements table and I don't know where it comes from) and still can't find where this implementation is within the sourcecode. By the way, has someone tried to solve this problem? Is there any kind of suggestion? I thought another possibility could be the following: starting from the clustered RDF triples in the article, construct two trees with the 2 resources we want to find the distance for as roots. Then each child is a connected resource with a distance from the central resource of the cluster which is less than the root's distance from the center. At the n-th level (where n is the distance from the central resource of the root resource) we will have the central resource for sure. Then for each resource in the first tree we search for it in the second one. Then we save every matching resource in a list with the sum of the distances from the roots in the two trees. Once every resource is checked, the resource in the list with the smallest value is the resource that minimizes the distance between the two nodes. The worst situation is that there are no common resources in the trees other than the central resource. Maybe this works but it's a bit elaborate and probably hard to compute, maybe there's a simpler way. Any suggestion? Thank you, Piero Molino uOk, no one has a solution for my problem ^_^ I hope that Jens will answer this time because he is one of the authors of the article I cited in the previous message, so he knows best.
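(Property paths did not exist when this thread was written, but on a current SPARQL 1.1 endpoint the "is there a connection within k hops" question can at least be sketched as follows. The two resources are placeholders, dbo:wikiPageWikiLink is assumed as the page-links property, and the query still gets expensive quickly for larger k:

PREFIX dbo: <http://dbpedia.org/ontology/>
ASK {
  <http://dbpedia.org/resource/ResourceA>
      (dbo:wikiPageWikiLink|^dbo:wikiPageWikiLink)
    / ((dbo:wikiPageWikiLink|^dbo:wikiPageWikiLink)?)
    / ((dbo:wikiPageWikiLink|^dbo:wikiPageWikiLink)?)
      <http://dbpedia.org/resource/ResourceB> .
}

Each optional segment allows one more hop, so this ASK succeeds if the two resources are at most three links apart, in either direction.)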
I managed to find the functions I need in the relfinder sourcecode (they were in the index.php) and I realized how the database queries are done, and as I thought, they are practically a series of joins. By the way, I can't get things working because of the statements table: can someone who tried this tell me how to construct it? May I use the DBpedia CSV dumps and import them into a MySQL table like this: ( `subject` varchar(255) collate latin1_general_ci NOT NULL, `predicate` varchar(255) collate latin1_general_ci NOT NULL, `object` varchar(255) collate latin1_general_ci NOT NULL, `id` int(10) unsigned NOT NULL, PRIMARY KEY (`id`) ) ? (this is the code of the CopyTable from the relfinder sourcecode) I hope someone will answer this time :( Thanks anyway, Piero Molino uHello, Piero Molino wrote: At the time we wrote the Relationship Finder, the statements table was easy to create. You just had to download the csv file of the DBpedia release and load it into your database. Now things have changed a bit since then. You have to perform slight modifications of the extraction code. I prepared a csv file for you here: In the next step, you have to load the data into your DB, which can be done using e.g. this PHP script on the command line:
$connection = mysql_connect('localhost',$user,$pass,true);
mysql_select_db('dbpedia_relfinder', $connection) or die(mysql_error());
mysql_query("DROP TABLE IF EXISTS statements") or die(mysql_error());
mysql_query("CREATE TABLE `statements` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `subject` varchar(255) collate latin1_general_ci NOT NULL default '',
  `predicate` varchar(255) collate latin1_general_ci NOT NULL default '',
  `object` text collate latin1_general_ci,
  `object_is` char(1) collate latin1_general_ci NOT NULL default '',
  PRIMARY KEY (`id`),
  KEY `s_sub_pred_idx` (`subject`(200)),
  KEY `s_pred_idx` (`predicate`(200)),
  KEY `s_obj_idx` (`object`(250))
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;") or die(mysql_error());
mysql_query("LOAD DATA LOCAL INFILE 'infobox.csv' IGNORE INTO TABLE statements") or die(mysql_error());
? The second step is computing the components of the RDF graph. To do this, you have to execute cluster_main.php on the commandline. This can take hours or even days depending on your machine. Regarding the queries, you are right that they are basically joins. We can very easily detect whether two resources are in the same component of the graph and - as you read in the paper - we can also efficiently give a minimum and maximum value for the distance between two resources. The hard part is to detect the exact distance. Using MySQL, we found that joins perform quite reasonably if the distance is below 8 (if I remember correctly). In the meantime I have also seen other (relatively recent) approaches to compute the distance between resources, which are particularly targeted at large graphs. However, I do not have a handy literature reference available. Currently, we are thinking about reviving the DBpedia Relationship Finder and are looking at ways to provide this tool without the involved maintenance overhead of keeping it up-to-date. This means that we will probably use SPARQL queries against Virtuoso. This approach works well for distances up to 3/4. (You can try SPARQL queries against the official DBpedia endpoint to test this.) Kind regards, Jens uHello, Piero Molino wrote: This looks like it should work, but performing UNION and JOINS efficiently is probably very hard for Virtuoso in this case.
Using LIMIT 1 and maybe a filter !isLiteral(?something) may give you better performance. You already found out that the implementation is in the DBpedia subversion in the /relfinder directory. One reason why the queries are more efficient than the one above is that we use a precomputed table storing all triples \"in both directions\", i.e. for each triple S-P-O we also store O-P-S. This way we do not need the union you mentioned above and MySQL can better optimise the joins on the statements table. We obviously also filter all cases where S or O are literals and remove some properties we do not want when computing the connection between two objects. Getting all nodes with a smaller distance from the root node and connected to a specific node is computationally not a problem. What is likely to be a problem is the number of nodes you have in those two trees, which may grow very fast. Another problem is that you won't have a guarantee to get the distance this way: Say you have a complete graph with three vertices a, b, root. If you ask for the distance between a and b using your method it would return 2 instead of one. (There are also other counterexamples.) The reason is that going towards the randomly chosen root node of a component does not necessarily mean that you are walking on a shortest path. Kind regards, Jens uOn 20 Mar 2009, at 08:44, Jens Lehmann wrote: That's fantastic! Thank you very much, I really appreciate your help! Ok, I'll try it later today so that I can leave the computer to calculate and enjoy itself :) Yes, that is clear from the paper and is a really clever idea. In my work I have to calculate a distance factor within a [0,1] range. In the article there's a graphic that shows the distance distribution, but it is clearly stated that it has to be interpreted cautiously because it is taken from a randomly selected node. There's also a reference to future work for a comprehensive analysis. Have you already published the results of this analysis? They would be extremely useful for my work and I will surely cite them in the bibliography. If not, can you suggest a maximum distance indicator so I can get values in [0,1] by dividing the obtained distance? If it happens that you find some literature about it, please tell me :) Yes, that's the same reason why I tried to find a Virtuoso query to do it (in the previous mail) so it would have worked on local Virtuoso servers with the dbpedia dump but also on the main dbpedia sparql endpoint. As I said, I will try those queries to find the maximum distance that makes them resolve in an acceptable time, and then decide what to implement. Thank you Jens, you probably don't know how helpful you have been. I'm really grateful. Once I finish my work I will release it under the GPL, so I hope that it will also be interesting for you, as your work is being useful to me. Piero uOn 20 Mar 2009, at 09:08, Jens Lehmann wrote: As the objective is to find if there's at least one way of such a number of steps, the idea of limiting to 1 is perfect! Now I will try it and find up to which distance it still has acceptable performance. Yes, I wrote that email when I had a first approach to the code; now I spent some hours figuring out how it worked, so now it's much clearer, thanks for your explanation.
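(To make the advice above concrete: a hedged sketch of the kind of query being discussed, checking for a connection over two intermediate nodes with the literal filter and LIMIT 1. The two end resources are placeholders, not from the original thread:

SELECT ?x ?y
WHERE {
  { <http://dbpedia.org/resource/ResourceA> ?p1 ?x . } UNION { ?x ?p1 <http://dbpedia.org/resource/ResourceA> . }
  { ?x ?p2 ?y . } UNION { ?y ?p2 ?x . }
  { ?y ?p3 <http://dbpedia.org/resource/ResourceB> . } UNION { <http://dbpedia.org/resource/ResourceB> ?p3 ?y . }
  FILTER(!isLiteral(?x) && !isLiteral(?y))
}
LIMIT 1

LIMIT 1 lets the engine stop as soon as any connecting path of this length is found, which is usually all that is needed to decide whether the distance is at most 3.)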
What you say is surely correct, I wouldn't find out if the nodes have distance 1 with the 2 trees I have described, but maybe working on the main idea I could find a working solution (maybe also adding the nodes at the same distance from the root? By the way, as I am short of time I think I'll use what is guaranteed to work). Thank you for the clear answers and the help, Piero" "Query regarding dbpedia dataset" "uHi I aim at using DBpedia for benchmarking. Can you please guide me as to whether the DBpedia dataset contains reified facts. If it does support reification, then can you please give a small example illustrating it? I know that the DBpedia dataset contains information about named graphs with each triple. Can you please guide me regarding reification. I'll be highly grateful to you regarding the same. Also, are there any freely available datasets in which triples are described using reification?" "recipe ingredients" "uWhy doesn't, for example, list the ingredients shown in the infobox at ? The latter uses an hRecipe microformat, BTW." "Query logs of DBpedia" "uHello list, I'm a PhD student and I'm currently working on my project in the field of IR. For the evaluation I'm using the DBpedia dataset and therefore I'd like to use a set of real-world queries issued to DBpedia to see how the system behaves. Is such a set publicly available? Would it be possible to have one? Best regards, Claudio uHi, I am also interested in these kinds of queries on DBpedia. Best regards Ibrahim uHello, On 28.01.2011 18:28, Claudio Martella wrote: An anonymous log excerpt for DBpedia 3.5.1 is available here: ftp://download.openlinksw.com/support/dbpedia/ Kind regards, Jens" "No german/i18n redirect files" "uHi, Does anyone have an idea why there are \"no\" i18n redirect files (I only checked de/ca/fr) for 3.7? I also found this mail which didn't get a reply: I need this file to create the surface forms/index for DBpedia Spotlight. Since I'm not familiar at all with the extraction framework, is there any way to create those on my own? Or even better, to fix the extraction configuration (if it's not on purpose)? Cheers, Daniel uHi list, Thanks for the reply and sorry for this delayed message. Maybe I shouldn't have said \"is there any way to create those on my own?\" since my resources are very limited at the moment. :) And since my Scala skills are not really measurable, all I could do is poke around in the dark. I would consider this a bug and I've created a bug report @ Maybe somebody who created the German redirects could share them? (Until this is fixed) This would be really awesome! Have a nice weekend! Cheers, Daniel uHi Daniel, This is not a bug. The 3.7 release does not include redirects for non-English wikipedias. You can easily create them yourself by adding 'org.dbpedia.extraction.mappings.RedirectExtractor' under 'extractor.de' (in the dump configuration file) and running the extraction framework for German.
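(Once such a redirects file is extracted and loaded, or on the public endpoint for English, the triples can be queried for surface forms. A minimal sketch, assuming dbo:wikiPageRedirects as the redirect property and an example target resource:

PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?redirect ?label
WHERE {
  ?redirect dbo:wikiPageRedirects <http://dbpedia.org/resource/Berlin> ;
            rdfs:label ?label .
}

Each returned label is a possible synonym or alternative spelling for the target, which is exactly the lookup use case mentioned earlier in this thread.)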
Cheers, Dimitris On Sat, Oct 15, 2011 at 7:05 PM, Gerber Daniel < > wrote: uHi Dimitris, I understand that your resources are limited, and you have to draw a line up to which you can provide the files people would like to download from dbpedia.org. However, when I try to use Linked Open Data from multiple datasets, for me it is a big hurdle to install a (basically unknown) framework for just one dataset in order to tweak the extraction of source data a little bit. It just doesn't scale up for many datasets, with (too) limited resources on my side. Plus, if I had extracted the data myself, this wouldn't have helped Daniel and possibly others who looked up these files on the download page. So if there is a chance to automate the process for one of the next releases, it would be really great. Nevertheless, thanks a lot for all the work you put into providing all this great data! Cheers, Joachim uHi Daniel, I'm sorry for this rather late reply but I was on vacation for 2 weeks. Limited resources are one of the reasons country-specific DBpedia versions have been made available. You'll find the version for the German language dataset under and the German redirects file that might interest you under: Kind Regards, Alexandru Todor On 10/17/2011 12:44 PM, Neubert Joachim wrote: uHi Alexandru, Great to learn - thanks a lot! Joachim" "Issue in Maven execution" "uHi, I think I have found an issue with Maven. When you run maven from the \"dump\" directory an old mapping and ontology version is retrieved, whereas if you run maven from the \"root\" directory the latest version of them is retrieved. For example, from the \"dump\" directory: extraction-framework/dump$ mvn scala:run -Dlauncher=extraction -DaddArgs=extraction.default.properties Here is the log: [INFO] uHi Julien, in one case it is reading the file from the filesystem, in the other case it is downloading it from the mappings server. If you check org.dbpedia.extraction.dump.extract.ConfigLoader you will see that:
// language-independent val
private lazy val _ontology = {
  val ontologySource = if (config.ontologyFile != null && config.ontologyFile.isFile) {
    XMLSource.fromFile(config.ontologyFile, Language.Mappings)
  } else {
    val namespaces = Set(Namespace.OntologyClass, Namespace.OntologyProperty)
    val url = new URL(Language.Mappings.apiUri)
    val language = Language.Mappings
    WikiSource.fromNamespaces(namespaces, url, language)
  }
  new OntologyReader().read(ontologySource)
}
I suppose there are problems with the maven scala -DaddArgs=./dump/extraction.default.properties Can you try to double-quote the whole string? e.g. \"-DaddArgs=./dump/extraction.default.properties\" Regards Andrea 2013/6/27 Julien Plu < > uOn Thu, Jun 27, 2013 at 1:28 PM, Andrea Di Menna < > wrote: Andrea is correct, but the problem lies in the contents of the property file # if ontology and mapping files are not given or do not exist, download info from mappings.dbpedia.org ontology=/ontology.xml mappings=/mappings The paths are relative and when you are in the root dir, it cannot locate the files and downloads them from the server. Cheers, Dimitris uHi, Thanks for the explanations :-) Personally I think the process would be more logical if we first tried to download the latest version of the mappings and ontology directly, and fell back to the local files if that is not possible. No? Best. Julien. 2013/6/27 Dimitris Kontokostas < > uThis is of course subjective but, having the option for a local cache is also good.
Cheers, Dimitris On Thu, Jun 27, 2013 at 1:45 PM, Julien Plu < > wrote: uHaving a static local version of the ontology and mappings is good for the dump extraction, which takes several days and may have to be restarted. Things would be unpredictable if we always used the latest version from the wiki, so we download the stuff once before we start the extraction. On Jun 27, 2013 2:42 PM, \"Dimitris Kontokostas\" < > wrote: uAs for the original question - don't run Maven from the root directory. :-) It's strange that it's working at all. On Jul 22, 2013 9:23 PM, \"Jona Christopher Sahnwaldt\" < > wrote: uOk! But I think the possibility of downloading the latest version of the mappings (maybe via an option) would be a good feature. What do you think of this? A really simple example of usage: I recently spent many hours on the French mappings; if I did it, it's mainly so that I and other people can use them directly in our extractions without waiting for the new DBpedia release. Best. Julien. 2013/7/22 Jona Christopher Sahnwaldt < > uQuote from # if ontology and mapping files are not given or do not exist, download info from mappings.dbpedia.org ontology=/ontology.xml mappings=/mappings On Jul 22, 2013 9:55 PM, \"Julien Plu\" < > wrote: uAh, true! I didn't remember this option in the conf file, sorry :-( 2013/7/22 Jona Christopher Sahnwaldt < >" "Categories present in article_categories_en.nt but not in page_links_en.nt ?" "uHi DBPedia-Community, I'm currently writing my Master's thesis in the field of DBPedia and SPARQL. One of my subgoals is to find out how many categories are present in both Wikipedia and DBPedia. Therefore, I wrote a little tool which identifies all categories having at least one resource in the unspecific mapping based part of DBPedia (if I refer to DBPedia in this mail, I usually mean this part of DBPedia, not the whole one). It searches the file mapping_based_properties_en.nt and checks whether or not the object and subject of each statement is linked to a category in the file article_categories_en.nt. If there is a link, the tool considers the corresponding category to be 'present' in DBPedia. On the other hand, the same tool searches the page_links_en.nt file to find all categories of Wikipedia. That is, all triples which relate a resource to a category or (if present at all) a category to any object. According to the description of the 'Page Links Extractor' it 'Extracts internal links between DBpedia instances from the internal pagelinks between Wikipedia articles.'. As Wikipedia pages normally link to their categories, I assumed that these links are also included and, thus, all categories in Wikipedia are captured. Unfortunately, this is only true for almost all categories. I found 127 categories which are present in DBPedia but not in Wikipedia, compared to 59099 categories present in Wikipedia and not in DBPedia. This is strange, as the set of DBPedia categories must be a subset of Wikipedia categories. Otherwise, some magic added some new categories during extraction, and I doubt that. I made sure it was not my fault and had a look at the data. One of the suddenly appeared categories is . On the DBPediasian side, there is a triple ( .) which relates this category to the United States Senate election in Alaska in 1996. The resource itself is the subject of two statements in mapping_based_properties_en.nt. On the Wikipediasian side, I did not find any triple in page_links_en.nt which contained the category.
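(As a cross-check of the same question on a SPARQL endpoint: a hedged sketch, assuming the loaded datasets use dcterms:subject for article_categories and dbo:wikiPageWikiLink for page_links; the category URI is only an example:

PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT (COUNT(DISTINCT ?viaSubject)  AS ?inArticleCategories)
       (COUNT(DISTINCT ?viaPageLink) AS ?inPageLinks)
WHERE {
  { ?viaSubject  dct:subject          <http://dbpedia.org/resource/Category:Example_category> . }
  UNION
  { ?viaPageLink dbo:wikiPageWikiLink <http://dbpedia.org/resource/Category:Example_category> . }
}

A category that shows up with a non-zero count only in the first column is exactly the kind of case described here: present via article_categories but absent from page_links.)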
But I did find the United States Senate election in Alaska in 1996 resource. The corresponding Wikipedia page also includes a link to the category. It has been present since page creation. What is the reason for this? * Is my assumption wrong, that the internal pagelinks also include links to categories? * If yes * Why were almost all categories captured? * Should I use the article_categories_en.nt file for Wikipedia, too? * Did the Pagelinks Extractor skip the corresponding LinkNode during traversal of the AST? * Does the extraction source miss this information? I'm looking forward to your answers. Regards, Gregor Trefs uI'm not sure, but there are two possible explanations: * Dbpedia has Yago categories in addition to wikipedia categories * have you run your own Dbpedia extractor? Because if not, you could have been looking at DBpedia information that is several months old! uAs I see it, the YAGO categories are classes and, thus, used as types for resources. Wikipedia categories are referred to with the property and don't act as a type. I did not find any yago classes in article_categories_en.nt. I didn't run my own extractor, but downloaded the files from dbpedia.org (3.7, September 2011). This is over a year after the category link was added to the wikipedia page (see my example). On 13.02.2012 14:27, Yury Katkov wrote: uOn 13 February 2012 13:01, Gregor Trefs < > wrote: Categories can also be added by templates. As Yury said, it's more likely that those articles have changed since the last extraction. As an aside, I originally thought that you were talking about some Asia-specific version of Wikipedia, and now put it down to some sort of interlanguage effect. If it's the latter, adjectives formed from English nouns ending -a typically have the ending -an (Wikipedia -> Wikipedian), but it's generally preferable, especially with proper nouns, to just use the noun as a modifier ('On the Wikipedia side'). page_links is meant to capture _normal_ wiki links found in the body of the text, article_categories is specifically for categories. $ bzgrep ' article_categories_en.nt.bz2 |grep ' I believe these are the triples you're looking for. If you find yourself wondering if you're looking in the right file, bear in mind that you can always use the website: $ curl ' ." "Italian short / long abstract problem" "uHello, In DBPedia 3.7, a big amount of long and short abstracts for the Italian language are messed up. They start from the second sentence of the Wikipedia article, skipping the first one, so the abstract as a whole is of little use as the subject is often unclear. Is it possible to fix that problem? Alessio uDear Alessio, Sorry, but this is actually not at the top of our to-do list. We could assist you a little in fixing the problem. Would you be willing to try it? Sebastian On 09/22/2011 02:02 PM, wrote: uHello Alessio, the fact that the abstract starts from the second sentence is probably due to the fact that many first sentences are generated from templates in the Italian Wikipedia. A clear example is the bio template for people. So please take a look at the original wikipedia page source and see if this is the problem. If not, please give us some examples of the messed-up abstracts. Regards, Piero Molino On 22 Sep 2011, at 19:06, Sebastian Hellmann wrote: uHi, Thanks for the reply. Yes, if you can assist me in this task this is fine. Which is the best/fastest way? Alessio uHi Piero, This seems to be a very common problem for the Italian dataset(s). I've done some random tests, and more than 50% of abstracts are messed up (this is a very random test so the numbers are only indicative). Just for example: uHello all, I think the first important step is to produce a test case. Can you provide cases where the extraction fails? The best would be to have one file per article that contains the original Wikipedia source and one file that contains the expected output. This will be a good basis to make regression test cases in the future. After you have provided these test cases, there are two choices: 1. you can wait until the next Wikipedia dump in 6 months, which I guess is when we will try to fix the problem based on the test cases you provided. 2. try to fix it yourself (with a little help from us) and make a new Italian dump (or the abstract data sets), which we will use to replace the current one.
(This would probably be much faster and you would learn how to produce/tune the DBpedia data) All the best, Sebastian On 09/23/2011 09:45 AM, wrote: uHello, Option number 2 is the only way for us (time). So, if you've a link or a document about that step, it's a great way to start. I can produce test cases, just for example, the abstracts for Thanks, uHi Alessio, you are giving examples about biographies. As i said some email ago, in italian wikipedia biographies are generated from the template and not written as text. This is the example for Vasco Rossi: {{Artista musicale |nome = Vasco Rossi |nazione = Italia |genere = Hard rock |nota genere = [ |nota genere2 = [ |anno fine attività = in attività |note periodo attività = |etichetta = [[Lotus (casa discografica)|Lotus]], [[Durium|Targa]], [[Carosello (casa discografica)|Carosello]], [[EMI Italiana]] |tipo artista = Cantautore |immagine = Vasco Rossi 2.jpg |didascalia = Vasco Rossi |url = [ |numero totale album pubblicati = 25 |numero album studio = 16 |numero album live = 7 |numero raccolte = 2 }} {{Bio |Nome = Vasco |Cognome = Rossi |PostCognomeVirgola = anche noto come '''Vasco''' o con l'appellativo '''''Il Blasco''''' [ |LuogoNascita = Zocca |GiornoMeseNascita = 7 febbraio |AnnoNascita = 1952 |LuogoMorte = |GiornoMeseMorte = |AnnoMorte = |Attività = cantautore |Nazionalità = italiano }} Autodefinitosi ''provoca(u)tore'', [ I worked on an algorithm that tries to replicate the generation, and it works quite good even if it's a bit messy, but the real problem is: is this behaviour, generating text from bios, common in other categories of italian wikipedia pages? You should show examples that are not \"people with biography\" so we could realize that. I was also thinking about asking wkipedia italy directly about that, maybe they can give detailed description of the issue and maybe give some code for the generation. Regards, Piero Molino Il giorno 27/set/2011, alle ore 09:59, < > ha scritto: uAlessio, I will go ahead and guess that a good starting point is here AbstractExtractor [1]. I believe you have to install MediaWiki in your machine and load Wikipedia. This extractor will connect to MediaWiki and ask for a rendered page (with templates resolved), then it will extract the abstract. It seems that the class uses HTTP requests for this. It sounds to me as a waste of resources, so I'll also suggest that you change the code to just call PHP directly [2]. That being said, I have to disclaim that all my knowledge about this is based on overhearing conversations while brewing coffee. :) I will also dare to give another idea. The guys behind Sweble ( of activity behind it. Maybe it is worth trying an alternative abstract extractor based on their library that would go straight to the dump, render pages and grab the abstracts for you. If this worked, it would be a step forward with regard to installing MySQL, MediaWiki, etc. They have a nice demo of the parser here: Cheers, Pablo [1] [2] On Tue, Sep 27, 2011 at 11:34 AM, Piero Molino < >wrote: uSome more info for the (current) abstract extraction processYou will have to install a local modified mediawiki and load the wikipedia dumps (after you clean then with the script) The detailed process is described here: This could be a new approach to the framework, not only for abstracts, but to replace the SimpleWikiParser. I think the current parser is LL and maybe we could change to an LR Parser to handle better recursive syntax. 
I haven't checked at sweble yet, but we could look into it Cheers, Dimitris uDimitris, Cool! Maybe we could test Sweble first as the new AbstractExtractor, since it seems to be the weakest link? If it works for that, then it could be gradually introduced in the core to substitute SimpleWikiParser. Alessio, if you take the challenge, please keep us updated about your progress on (btw, let's move this discussion there?) Cheers, Pablo On Tue, Sep 27, 2011 at 1:19 PM, Dimitris Kontokostas < >wrote: Dimitris, Cool! Maybe we could test Sweble first as the new AbstractExtractor, since it seems to be the weakest link? If it works for that, then it could be gradually introduced in the core to substitute SimpleWikiParser. Alessio, if you take the challenge, please keep us updated about your progress on (btw, let's move this discussion there?) Cheers, Pablo On Tue, Sep 27, 2011 at 1:19 PM, Dimitris Kontokostas < > wrote: Some more info for the (current) abstract extraction processYou will have to install a local modified mediawiki and load the wikipedia dumps (after you clean then with the script) The detailed process is described here: Dimitris" "Missing DBpedia resources ?" "uHi, I've noticed a couple of examples where there seem to be entries at dataset I get no results returned e.g. returns a page with data but running : SELECT ?predicate ?object WHERE { ?predicate ?object . } at the SPARQL interface returns no results. Could someone explain why ? Thanks, Rob uHey Rob, Interesting one Yes, I can. There simply are no triples matching the pattern {dbpedia:iPhone ?p ?o . } The output pages (dbpedia.org/page/ and dbpedia.org/data/) are generated on-the-fly via a \"DESCRIBE \" query, which is equivalent (in Virtuoso's implementation, no necessarily by spec) to SELECT * where { ?p ?o . } UNION SELECT * where { ?s ?p . } In your example, there are only S P dbpedia:iPhone triples available. If you look at dbpedia.org/page/iPhone you will see, that the properties are named \"is blabla of\", which indicates that those triples match to { ?s ?p dbpedia:iPhone .} Apple's mobile phone you are looking for has the URI really an interesting example, because the URI does not match the article's title in Wikipedia (which has a lower case \"i\"). Best, Georgi uP.S.: To find resource URIs in DBpedia, try Georgi uOn 26 Apr 2008, at 00:14, Georgi Kobilarov wrote: Interesting indeed! It turns out that the Wikipedia article has the uppercase letter: But the article markup contains this: {{lowercase}} I think this makes the first letter of the title show in lower case, so we get “iPhone” as the title. The DBpedia extractor code probably doesn't know about {{lowercase}} and therefore doesn't handle this correctly. Best, Richard uGeorgi Kobilarov wrote: Thanks, that makes sense now :) This raises another question then - why do the redirects all point to the 'iPhone' resource and not the 'IPhone' resource ? e.g. SELECT ?predicate ?object WHERE { ?predicate ?object . } returns : Is this pointing to the wrong resource ? uGeorgi Kobilarov wrote: uHey Rob, > > It uses a pre-generated Lucene index. Sparql queries are still too slow for such a service, and there's no way to do the result ranking in sparql. The service is just a very early prototype, no RESTful webservice yet. I've got a newer version with type-based filtering, which I'll publish soon(ish). 
Cheers, Georgi" "DBpedia returns Turtle but says Content-Type is text/plain" "uHello, I have run into the problem that when I send a SPARQL request with a CONSTRUCT query to DBpedia programmatically (using Rob Vesse's dotNetRDF library; Florian" "dbpedia 3.5.1 busts my importer script" "uI noticed that the \"labels_en\" file has duplicate rows, something that wasn't the case in the last one. I found 157 of these but here's a particularly annoying one: $ bzcat ~/dbpedia_3.5.1/labels_en.nt.bz2 | grep '/resource/SS>' \"SS\"@en . \"SS\"@en . These are 120k lines between those in the log file, so I've got no idea what the etiology of this is. I liked the old alphabetical order: it was very efficient to build a clustered index on with a minimum of I/O. ;-) As it is I'll probably crank up my memory limit, keep a hashtable of the resource URLs I've seen, and expect the index build at the end to take a little more time" "RDF Validator puts Freebase and DBpedia Live to the test" "uPRESS RELEASE Paul Houle, Ontology2 founder, stated that \"we updated Infovore to accept data from DBpedia, and ran a head to head test, in terms of RDF validity, between Freebase and DBpedia Live.\" \"Unlike most scientific results\", he said, \"these results are repeatable, because you can reproduce them yourself with Infovore 1.1. I encourage you to use this tool to put other RDF data sets, large and small, to the test.\" The tool parallelSuperEyeball was run against both the 2013-03-31 Freebase RDF dump and the 2012-04-30 edition of DBpedia Live. Although Freebase asserts roughly 1.2 billion facts, Infovore rejects roughly 200 million useless facts in pre-filtering. Downstream of that we found 944,909,025 valid facts and than 66,781,906 invalid facts, in addition to 5 especially malformed facts. This is a serious regression compared to the 2013-01-27 RDF dump, in which only about 13 million invalid triples were discovered. The main cause of the increase is the introduction of 40 million or so \"triples\" lacking an object connected with the predictate ns:common.topic.notable_for. Previously, the bulk of the invalid triples were incorrectly formatted dates. The rate of invalid triples in Dbpedia Live was found to be orders of magnitude less than Freebase. Only 8,664 invalid facts were found in DBpedia Live, compared to 247,557,030 valid facts. The predominant problem in DBpedia Live turned out to be noncomfortmant IRIs that came in from Wikipedia. This is comparable in magnitude to the number of facts found invalid in the old Freebase quad dump in the process of creating :BaseKB Pro. Just one of the tools included with Infovore, parallelSuperEyeball is an industrial strength RDF validator that uses streaming processing and the Map/Reduce paradigm to attain nearly perfect parallel speedup at many tasks on common four core computers. Infovore 1.1 brings many improvements, including a threefold speedup of parallelSuperEyeball and the new Infovore shell. Please take a look at our github project at and feel free to fork or star it. Note that many infovore data products are also available at Because infovore is memory efficient, it is possible to use it to handle much large data sets than can be kept in a triple store on any given hardware. The main limitation in handling large RDF data sets is running out of disk space, which it can do quickly by avoiding random access I/O. 
\"We challenge RDF data providers to put their data to the test\", said Paul Houle, \"Today it's an expectation that people and organizations publish only valid XML files, and the publication of superParallelEyeball is a step to a world that speaks valid RDF and that can clean and repair invalid files.\" Ontology2 is a privately held company that develops web sites and data products based on Freebase, DBpedia, and other sources. Contact with questions about Ontology2 products and services. PRESS RELEASE Paul Houle, Ontology2 founder, stated that \"we updated Infovore to accept data from DBpedia, and ran a head to head test, in terms of RDF validity, between Freebase and DBpedia Live.\" \"Unlike most scientific results\", he said, \"these results are repeatable, because you can reproduce them yourself with Infovore 1.1. I encourage you to use this tool to put other RDF data sets, large and small, to the test.\" The tool parallelSuperEyeball was run against both the 2013-03-31 Freebase RDF dump and the 2012-04-30 edition of DBpedia Live. Although Freebase asserts roughly 1.2 billion facts, Infovore rejects roughly 200 million useless facts in pre-filtering. Downstream of that we found 944,909,025 valid facts and than 66,781,906 invalid facts, in addition to 5 especially malformed facts. This is a serious regression compared to the 2013-01-27 RDF dump, in which only about 13 million invalid triples were discovered. The main cause of the increase is the introduction of 40 million or so \"triples\" lacking an object connected with the predictate ns:common.topic.notable_for. Previously, the bulk of the invalid triples were incorrectly formatted dates. The rate of invalid triples in Dbpedia Live was found to be orders of magnitude less than Freebase. Only 8,664 invalid facts were found in DBpedia Live, compared to 247,557,030 valid facts. The predominant problem in DBpedia Live turned out to be noncomfortmant IRIs that came in from Wikipedia. This is comparable in magnitude to the number of facts found invalid in the old Freebase quad dump in the process of creating :BaseKB Pro. Just one of the tools included with Infovore, parallelSuperEyeball is an industrial strength RDF validator that uses streaming processing and the Map/Reduce paradigm to attain nearly perfect parallel speedup at many tasks on common four core computers. Infovore 1.1 brings many improvements, including a threefold speedup of parallelSuperEyeball and the new Infovore shell. Please take a look at our github project at products are also available at handle much large data sets than can be kept in a triple store on any given hardware. The main limitation in handling large RDF data sets is running out of disk space, which it can do quickly by avoiding random access I/O. \"We challenge RDF data providers to put their data to the test\", said Paul Houle, \"Today it's an expectation that people and organizations publish only valid XML files, and the publication of superParallelEyeball is a step to a world that speaks valid RDF and that can clean and repair invalid files.\" Ontology2 is a privately held company that develops web sites and data products based on Freebase, DBpedia, and other sources. Contact with questions about Ontology2 products and services. uI was always eager to know whether it's possible to improve Dbpedia parsers and mappings by comparing the data between Dbpedia and Freebase. Are there any experiments in this direction? 
uIn DBpedia we tried very hard to improve the overall quality and we are very interested to take a closer look at the invalid triples. Please note that the dump based framework, which was heavily refactored & improved for 3.8 release is not in full sync with the English DBpedia Live. We already have an improved version of the live framework that is available for Dutch only (live.nl.ddbpedia.org) and will be deployed for English once we test it a little more. In that version we already take better care of IRIs because the Dutch language has many non-ASCII characters. Best, Dimitris On Wed, Apr 10, 2013 at 3:01 AM, Paul A. Houle < > wrote:" "Mapping question" "uI am trying to understand why Toyota Prius does not come back from the following sparql query (from SELECT * WHERE { rdf:type } but Honda_Insight does match the corresponding query: SELECT * WHERE { rdf:type } If I go to and click 'Find usages of this Wikipedia template', I can see Toyota_Prius comes up as an example. What am I missing? thanks, Scott uMaybe it didn't have the infobox at the time the dump was generated? I see some infobox changes in Sept2010 You can look at the Wikipedia Dump from Oct2010 to know for sure. Cheers, Pablo On Jun 15, 2011 10:34 PM, \"Scott White\" < > wrote: Maybe it didn't have the infobox at the time the dump was generated? I see some infobox changes in Sept2010 wrote: uThere was a change of the infobox template name [1]. This was recently noticed by the mapping editors [2] but not in time for the extraction of the last release. Furthermore, the infobox name on the Toyota_Prius page was updated to the new name by the time of the extraction. On the page of Honda_Insight, this update was not made in time. The old version of the mapping was able to extract data for the old version of the template (Honda_Insight) but not for the new (Toyota_Prius). That is also why Toyota_Prius does not have many properties from the /ontology/ namespace. Cheers, Max [1] 14 April 2010 [2] 21 May 2011 On Wed, Jun 15, 2011 at 22:32, Scott White < > wrote:" "DBpedia SPARQL endpoint down!" "uChris Bizer wrote: Chris, The Server didn't go down. Remember, we have a monitoring system that notifies key individuals about the state of the DBpedia instance, the last time the instance went down was around: 15.37pm EST (because I asked the guys to change some of the Virtuoso parameters via the instance INI file which required an instance restart): Group : dbpedia Service : dbpedia down Time noticed : Wed Sep 5 20:37:21 2007 (GMT) Secs until next alert : Tim: Please confirm that the monitor script is now tracking the new server (Open Solaris) as opposed to the old (Linux). As I write, the server is live. Please confirm this is so for you. Kingsley uKingsley Idehen wrote: Alerts have been seen concerning dbpedia on oplussol10 before now. So both servers are monitored alike. ~Tim uHi Kingsley and Tim, Yes, the SPARQL endpoint is up again and also fast like hell. I guess your query cache has kicked in. Thanks for fixing this so quick. Cheers Chris uHi Kingsley, Zdravko and Mitko, the DBpedia SPARQL endpoint did a pretty good job in taking the traffic after the annoucement yesterday :-) But now it seam to be down for some reasons. Any ideas why? Cheers Chris uChris Bizer wrote: Chris, We didn't fix anything? As I said, I just had the server parameters updated (basically to reflect what was on the old DBpedia linux server) which lead to a few restarts :-) Kingsley" "Problem with SPARQL query returning 0 results" "uHello, I am new to this list. 
I am also a DBpedia and SPARQL noob so please excuse what might be a dumb question. A few months ago I played around with DBpedia and tried to construct a query that would return Wikipedia articles around a certain geo location. This is the query that I came up with (to search for German articles with geodata in and around Berlin, between 52-53 deg north and 13-14 deg east): PREFIX geo: PREFIX foaf: PREFIX dbpedia: PREFIX dbpedia2: PREFIX dbo: SELECT ?subject ?label ?lat ?long ?url ?thumbnailurl ?abstract WHERE { ?subject geo:lat ?lat . ?subject geo:long ?long . ?subject rdfs:label ?label . ?subject foaf:page ?url . ?subject dbo:thumbnail ?thumbnailurl . ?subject dbpedia2:abstract ?abstract . FILTER(xsd:float(?lat) >= 52.0 && xsd:float(?lat) <= 53.0 && xsd:float(?long) >= 13.0 && xsd:float(?long) <= 14.0 && lang(?label) = \"de\" && lang(?abstract) = \"de\" ) . } Using when I tried it a few months ago. But when I tried it again recently (over the last month or so), it only returned an empty result set: { head = { link = ( ); vars = ( subject, label, lat, long, url, thumbnailurl, abstract ); }; results = { bindings = ( ); distinct = 0; ordered = 1; }; } Can you spot what I am doing wrong? Has anything changed on the DBpedia side? I appreciate your help. Ole uHello Ole, The property for abstract has been changed since DBPedia Version 3.5: Instead of dbpedia2:abstract you have to query for dbo:abstract now. Then your query works. Because such changes can happen from time to time you can check your queried properties in case of an empty result by yourself: Probably the easiest way is to browse to a resource that you expect in your result, e.g. Berlin Benjamin Von: Ole Begemann [ ] Gesendet: Dienstag, 4. Mai 2010 11:58 An: Betreff: [Dbpedia-discussion] Problem with SPARQL query returning 0 results Hello, I am new to this list. I am also a DBpedia and SPARQL noob so please excuse what might be a dumb question. A few months ago I played around with DBpedia and tried to construct a query that would return Wikipedia articles around a certain geo location. This is the query that I came up with (to search for German articles with geodata in and around Berlin, between 52-53 deg north and 13-14 deg east): PREFIX geo: PREFIX foaf: PREFIX dbpedia: PREFIX dbpedia2: PREFIX dbo: SELECT ?subject ?label ?lat ?long ?url ?thumbnailurl ?abstract WHERE { ?subject geo:lat ?lat . ?subject geo:long ?long . ?subject rdfs:label ?label . ?subject foaf:page ?url . ?subject dbo:thumbnail ?thumbnailurl . ?subject dbpedia2:abstract ?abstract . FILTER(xsd:float(?lat) >= 52.0 && xsd:float(?lat) <= 53.0 && xsd:float(?long) >= 13.0 && xsd:float(?long) <= 14.0 && lang(?label) = \"de\" && lang(?abstract) = \"de\" ) . } Using when I tried it a few months ago. But when I tried it again recently (over the last month or so), it only returned an empty result set: { head = { link = ( ); vars = ( subject, label, lat, long, url, thumbnailurl, abstract ); }; results = { bindings = ( ); distinct = 0; ordered = 1; }; } Can you spot what I am doing wrong? Has anything changed on the DBpedia side? I appreciate your help. Ole u2010/5/4 Benjamin Großmann < >: Do properties get deprecated for some period of time before they go away? Are there tools to help application writers find their uses of deprecated properties? More generally, what strategies can application writers use to keep their applications from breaking when new versions are released? 
Tom uTom Morris wrote: It's a smaller complaint, but I got a little mad that the names of the files of the dbpedia dump changed. I've got scripts that download them all and import them automatically, and they all broke because of the file name change. (Some of them broke because of other changes, too) Generally I'm a distrusting sort of fellow (i-n-t-J), so dbpedia assertions go through a \"closed world\" filter before they get into my system. Something funky turns up with every new dbpedia dump, often harmless or easy to adapt to. Changing property names sticks out like a sore thumb: these kind of problems get handled before they result in broken joins and failing queries. Overall, I think the the public SPARQL interface is good for (i) little \"touch up\" projects against your own database and (ii) people who write conference papers rather than build working systems. My web sites get enough visitors that I get complaints about things that are wrong, even about little obscure things: if I'm making a list of \"important\" cities, I'll need NYC, London and Tokyo to make the list. This isn't just a problem with dbpedia, it's pretty universal in the \"generic database\" space. Freebase has pretty much the same issues, and I gave up on a plan to extract a few thousand \"facts\" from the OSM API because, as usual, (i) the API is poorly documented, (ii) the data format is poorly documented, and (iii) I couldn't get any help from the mailing list. So I'm downloading the 9GB planet.OSM file and figure that my \"closed world\" tools will make short work of the problem. uTom Morris wrote: u2010/5/4 Benjamin Großmann < >: Thanks for the help, Benjamin. Ole uHi, we have documented the properties used in the DBpedia data sets here: The property definitions can be either found in the DBpedia ontology or in other ontologies that are used. You can have a look at the previews provided for each data set on the Downloads page as well: Cheers, Anja On 04.05.2010 13:04, Benjamin Großmann wrote:" "Q: dbprop:name attribute meaning" "uHi All, When looking at We can guess from context it is relation but on the other hand name is telling us other meaning . I'm wonder what is supposed to express this property, why it is both: literal and object reference? Best Regards, Mitko uHello, Mitko Iliev wrote: Looking at the source of I see several templates containing the name attribute: {{CFB Yearly Record Subhead | name = [[Pittsburgh Panthers football|Pittsburgh Panthers]] | conf = Division I-A Independent | startyear = 1989 | endyear = 1990 }} {{CFB Yearly Record Entry | championship = | year = 1989 | name = Pittsburgh For each infobox containing the name attribute (some of them Wiki links and some not), a name property is extracted. The infobox coherence proposal we are currently discussing in Wikipedia (see my previous mail on the list) can solve those problems (in that case another problem is that \"name\" does not stand for the name of the person, but rather for a team in which the person played). It is not clear yet whether and when the issue will be fixed. Kind regards, Jens" "data sets for language versions" "uHi everyone, while working with the DBpedia 2014 Download, I was wondering why several types of data sets are only available for some language versions. For example the instance_types file is available for 28 language versions and the persondata file seems to be available only for the english and german wikipedia. Does anyone know the reason for this? 
Thanks a lot, Christoph uHi Christoph, 28 language versions are those that at the moment of the DBpedia 2014 extraction had non-empty mapping chapters, that's why all data sets that use ontology properties and classes are available for 28 languages. For some other extractors the information might be available in less languages in Wikipedia itself or/and extraction is language-specific. Cheers, Volha On 3/2/2015 1:46 PM, Christoph Hube wrote: uHi Volha, thanks for your help! Do you know why the mapping chapters for the other language versions are empty? The information for instance types should be available for more than 28 language versions on Wikipedia. best regards, Christoph Am 02.03.2015 um 14:32 schrieb Volha Bryl:" "Incomplete resource descriptions" "uHi, I try to access resource descriptions on DBpedia via HTTP, but often the data returned is, although valid and correct, incomplete and therefore not really useful in an application. For instance, accessing $ curl -I -H \"Accept: application/rdf+xml\" redirects to When retrieving this, around 10,000 RDF triples are returned, but none of them has subject uHello, As far as I know, the XML format returns the URI as object and subject as well. However it starts with the objects, therefore if you reach the limit (I thought it was 2000) before getting to the subject part, then you only get it back as object. If you retrieve the data as Ntriples, Atom or JSOD (add the extension at the end of the data URI), then it will only return the URI as subject. I have no idea why it works like this (and why there is no documentation whatsoever about it) but this could be a workaround for you Zoltán On 2011.12.06. 9:40, Bernhard Schandl wrote:" "Stale data in DBpedia Live?" "uHi, I've read there is continuous synchronization between DBpedia Live and Wikipedia, but I'm finding issues with this. I've tried searching for other posts with possible clues, but haven't discovered any. 1. So if I look in Wikipedia at a baseball player who was traded this past January, I find this: Wikipedia ( Current team: Washington Nationals (2016–present) First line of text: Benjamin Daniel Revere (born May 3, 1988) is an American professional baseball outfielder for the Washington Nationals of Major League Baseball (MLB). 2. Using the DBpedia Live Sparql endpoint I find the team to be ok, but the abstract and comment to be pretty stale, he was traded away from the Phillies on July 31, 2015: DBpedia Live: ( dbo:team:Washington Nationals dbo:abstract: Ben Daniel Revere (born May 3, 1988) is an American professional baseball player for the Philadelphia Phillies of Major League Baseball (MLB). rdfs:comment: Benjamin Daniel Revere (born May 3, 1988) is an American professional baseball outfielder with the Philadelphia Phillies of Major League Baseball (MLB). 3. Just to complete what I found, this is what the DBpedia page has for him. The abstract and comment are newer than in DBpedia Live! Dbpedia ( dbo:team: Toronto_Blue_Jays dbo:abstract: Benjamin Daniel Revere (born May 3, 1988) is an American professional baseball outfielder for the Toronto Blue Jays of Major League Baseball (MLB). rdfs:comment: Benjamin Daniel Revere (born May 3, 1988) is an American professional baseball outfielder for the Toronto Blue Jays of Major League Baseball (MLB). Can someone please explain why this is so? Thanks for your help! 
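One way around the truncation Zoltán mentions is to ask the SPARQL endpoint only for the outgoing triples of the resource, so the URI appears solely in subject position; the resource below is an arbitrary example, not the one from the original report, and the endpoint's own result limit still applies.
SELECT ?property ?value
WHERE {
  <http://dbpedia.org/resource/Germany> ?property ?value .
}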
Emery uHello Emery, DBpedia Live currently does not receive updates from Wikipedia, because we have to switch to a newly introduced notification system [1], which takes longer than expected. That is why, there are no changes processed since January. Raphael Boyer is working on that and we will hopefully soon start the extraction again. Best regards. [1] uThanks for the information. Are there any projected dates when that'll be up and running? uHi, Are there any status updates as to when DBpedia Live might once again be synchronized with Wikipedia? Thanks! uHi Emery, DBpedia Live has been restarted recently. As you can see from the changesets it is effectively producing output since Hopefully, it will be stable for some time again. Let us know, in case you experience any problems or unexpected behaviour. Best Magnus uActually we re-enabled Live on Sunday and looks like the updated framework works quite well (thanks to HPI for the contributions) It will take a few days until all pages are re-updated though (we are now ~15%) On Tue, Jul 12, 2016 at 6:16 PM, Emery Mersich < > wrote: uWell, there is a discrepancy. I use this url: This SPARQL query: SELECT ?player ?label ?teamLabel ?abstract ?description_en WHERE { ?player rdfs:label \"Ben Revere\"@en . ?player rdfs:label ?label . ?player dbo:abstract ?abstract . ?player dbo:team ?team . ?team rdfs:label ?teamLabel . OPTIONAL { ?player rdfs:comment ?description_en . FILTER (LANG(?description_en) = 'en')} . } It gives the team as the \"Washington Nationals\"@en, which is correct. The abstract still says: \"Ben Daniel Revere (born May 3, 1988) is an American professional baseball player for the Philadelphia Phillies\" Wikipedia says: \"Benjamin Daniel Revere (born May 3, 1988) is an American professional baseball outfielder for the Washington Nationals \" uThat’s excellent news! I’ll keep checking it and look forward to using it. Thanks very much! uActually abstracts were problematic with the previous framework and all abstracts were copied from the previous static release when we see that the new live extraction works well we will to re-enable the abstract extraction and keep the abstracts in sync as well On Wed, Jul 13, 2016 at 4:58 PM, Emery Mersich < > wrote: uDoes the same apply to descriptions? Those are also old in the results of my query. Any thoughts on when abstracts will get turned on? uHi Emery, yes, descriptions are are also part of abstracts Looks like it is safe to re-enable now, everything runs smoothly however, since the existing abstracts are not part of the live extraction we have 2 options 1) delete all existing abstracts and let them re-populate as live runs - we will have no abstracts for some articles for some time (1-2 months) 2) leave the existing abstracts and for some time we will have 2 entries for articles (the live + the manually loaded) What would be your preferred option? On Mon, Jul 18, 2016 at 7:09 PM, Emery Mersich < > wrote: uHi, Suggestion 1 is better for us. If you delete all the abstracts we can just grab and use the abstracts that are populated, and ignore the others. Thanks!" "it.dbpedia.org DNS" "uHi all, we are working hard to prepare the Italian Chapter of DBpedia and until now we patch our /etc/hosts in order to test and work on the site. We really would need to set the dbpedia DNS to our site, which is 194.116.72.119 Please add a DNS record to our site, the content is improving and english content will come soon too, we would like to release on April the 19th during a Trento University Event. 
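If option 2 above were chosen, the temporarily duplicated abstracts would be easy to spot with an aggregate query along these lines against the Live endpoint; the property URI is written out in full and the LIMIT is arbitrary.
SELECT ?resource (COUNT(?abstract) AS ?abstracts)
WHERE {
  ?resource <http://dbpedia.org/ontology/abstract> ?abstract .
  FILTER ( lang(?abstract) = "en" )
}
GROUP BY ?resource
HAVING ( COUNT(?abstract) > 1 )
LIMIT 10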
As now we are mimicking el.dbpedia.org pages, but that would change in next few days. uMarco, Richard has kindly done this for us. Now we just need to wait for the DNS juice to flow through the Internet's veins. Cheers, Pablo On Thu, Apr 12, 2012 at 1:16 PM, Marco Amadori < >wrote: u2012/4/12 Pablo Mendes < >: Great NEWS! Many thanks Richard!" "Examples of using Dbpedia" "uHi all, With the great job made with dbpedia 2.0 (Congratulation and thanks to all the team), it's a pleasure to play with dbpedia's data and I developed some experiments to show the possibilities of Dbpedia and web semantic technologies ( It's in french, but I thought these experiments could interrest you. For example : - a mashup with Google maps ( e.g. : european capitals ( or UNESCO world heritages ( - List of persons thanks to Birth city and exhibit ( e.g. : persons born in Amsterdam ( I hope these modest examples can help your wonderful project. If you have some questions, don't hesitate, I will try to answer. Best wishes Gautier Poupeau uHi Gautier, very nice demos :-) I have added links to them to the Dbpedia page, see If you would ask me for improvements and as my French is not too good, I clearly would not mind a \"Change Language\" button that supports English, German and Spanish ;-) Keep on the great work! Cheers Chris uChris Bizer a écrit : Thank you Chris I would like to say : \"it's done\", but I didn't find time to do it. I hope in next weeks with new examples. Cheers Gautier" "OntologyClass addition and dbpedia live" "uHi all, I have added yesterday a new class OntologyClass:Noble [1] to start mapping the Infobox_noble [2]. I can see the class here: but not here: What am I doing wrong? Also I have started adding some mappings for: - - - - but I am not sure they are correct even if they show to be valid. If I query live.dbpedia.org SPARQL endpoint I am not really getting what I was expecting. Is there any problem with the endpoint freshness or are there problems with my mappings that prevent correct generation of data? Thank you very much for your help, Andrea [1] [2] Mapping_en:Infobox_noble uHi Andrea, On 12/21/2012 12:01 PM, Andrea Di Menna wrote: I can see your ontology class at that address. Actually the framework had stuck and I had to restart it, and it should work now. I can see some articles, which use the mappings you mentioned, are processed already e.g. http://dbpedia.org/resource/The_Split_of_Life, which uses http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_artwork. So you should be able to see the changes as well." "Escaping of titles" "uHello, I am trying to construct dbpedia URIs from Wikipedia page titles, but it dbpedia seems to be escaping more characters than are required by RFC3986 - for example ',' is encoded to %2C. What is doing the escaping? Is there a list of characters that are escaped in titles? Thanks, nick. This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uHi, we try to be as close as possible to the Wikipedia title encoding scheme. Unfortunately, we mistakingly encoded the \",\" character, thanks for the hint. This behavior will be fixed in the next Release. 
The current behavior is as follows:
- The alphanumeric characters \"a\" through \"z\", \"A\" through \"Z\" and \"0\" through \"9\" remain the same.
- The special characters \".\", \"-\", \"*\", \"/\", \":\" and \"_\" remain the same.
- The space character \" \" is converted into an underscore \"_\".
- All other characters are unsafe and are first converted into one or more bytes using UTF-8 encoding. Then each byte is represented by the 3-character string \"%xy\", where xy is the two-digit hexadecimal representation of the byte.
- Furthermore, multiple underscores are collapsed into one.
Cheers, Robert On Thu, Jul 29, 2010 at 3:37 PM, Nicholas Humfrey < > wrote: uHello, Another problematic character is '&' which is valid in the path component of the URL. Example: Thanks, nick. On 03/08/2010 17:17, \"Robert Isele\" < > wrote:" "How to unsubscribe" "uHello everyone, Could anyone let me know how to unsubscribe from the dbpedia discussion list? Thanks in advance. - Ankur Padia. uOn 9/11/13 6:39 pm, Ankur Padia wrote: I would advise following the link that is at the bottom of every single message to this list and see if there's anything there that might help you. Jeen" "Changes in the string format" "uGreetings from San Diego- I note that DBP seems to have changed its approach to representing strings from say \"Hassan Rouhani\"@en to \"Hassan Rouhani\"^^xsd:string Is there some alternative to rewriting all my queries? If not, could I trouble someone for a pointer to some resource that describes this new regime, with some discussion of how to query for non-English string values? Thanks, Eric Scott uHello Scott, For more information on this change please read [1]. If this question is only about DBpedia Live, there is an issue in recent development that we are about to fix [2] Best, Dimitris [1] [2] On Fri, Oct 3, 2014 at 4:06 PM, Eric Scott < > wrote:" "Uploading dbpedia in Sesame 1.2.6 native repositories" "uHi Vanessa, There'll be a new DBpedia release within the next days, which will hopefully resolve that problem. Best, Georgi" "2nd call - Challenge: Doing Good by Linking Entities" "u* [Apologies for cross-posting] We are accepting submissions! Deadline: April 10th Need ideas? Have ideas? Prizes: iPad 2 Doing Good by Linking Entities, Developer's Challenge 2nd International Workshop on Web of Linked Entities (WoLE2013) In conjunction with the 22nd International World Wide Web Conference (WWW2013), Rio de Janeiro, 13 May 2013 The WoLE2013 Challenge is offering prizes for the best applications that show the impact of the Web of Linked Entities on problems/solutions affecting the local community, e.g. detecting corruption, tracking criminality, facilitating access to education or health services, helping to search for the cure for neglected diseases, promoting citizen participation in government, improving tourism-related services, etc. Although we have observed an explosion of the number of structured data sources shared on the “Web of Data”, the majority of the available Web content is still unstructured or semi-structured" "Gsoc Project" "uHello, I am Ishaan Batta, pursuing B.Tech in Computer Sciences and Engineering Dept. at IIT Delhi, I am looking forward to work on the following project under Gsoc 2013.
Efficient graph-based disambiguation and general performance improvements I have done exclusive courses on Data Structures and Graph Algorithms and I am really passionate about coding. I am really enthusiastic about the project and currently exploring the same. I'd be glad if someone could guide me on how to proceed on the same. Looking forward to work with you on this Project. uHello & welcome to DBpedia+Spotlight GSoC ! All GSoC related talk takes place in our GSoC mailing list. So, please read the guide we prepared for candidate students and come back with more questions :) Best, Dimitris On Sun, Apr 21, 2013 at 2:04 PM, Ishaan Batta < > wrote:" "The difference between infobox downloads" "uHello, this is my first post to this list, I'm glad to have found it. My question is this - on the downloads page there is an option for \"Infoboxes\" and \"Ontology Infoboxes\". What is the difference between the two? I've noticed a few differences - \"Ontology Infoboxes\" appears to have substantially more data in it (or am I off here?) and the predicates between the two different sets of triples appear to be different: Infoboxes: IO: What's the difference between the \"property\" and \"ontology\" in the predicates? Thanks in advance for any light you can shed on this. Ander Murane uHello, Ander Murane schrieb: Both data sets are the result of extracting information from Wikipedia infoboxes. The \"Infoboxes\" data set extracts the information using a generic approach, i.e. each attribute $foo in a Wikipedia infobox is translated to a property the \"Ontology Infoboxes\" data set uses a mapping based approach, i.e. there is a manually maintained set of mapping rules. For instance the attribute $foo could be mapped to extracts more data and the latter one extracts cleaner data. In particular different spellings of an attribute are often mapped to the same property. Kind regards, Jens" "SPARQL DBPedia.org Query" "uDear all, My question might be on the silly side, but then again answering it will be easy, so here goes: I'm using http://dbpedia.org/snorql to query dbpedia in order to get a list of Airports in the United States (ICAO code starting with \"K\") and their elevation and IATA code. Query included below. Works fine.
SELECT ?resource ?hasValue ?isValueOf ?elevation ?icao ?iata
WHERE {
  { <http://dbpedia.org/ontology/Elevation> ?resource ?hasValue }
  UNION
  { ?isValueOf ?resource <http://dbpedia.org/ontology/Airport> } .
  { ?isValueOf <http://dbpedia.org/ontology/elevation> ?elevation . }
  { ?isValueOf <http://dbpedia.org/ontology/icaoLocationIdentifier> ?icao . }
  OPTIONAL { ?isValueOf <http://dbpedia.org/ontology/iataLocationIdentifier> ?iata . }
  FILTER regex(?icao, \"^K\")
}
The problem is that DBpedia returns every airport twice, once with elevation as a decimal and once with elevation as a more exact decimal. Example below:
resource hasValue isValueOf elevation icao iata
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Los_Angeles_International_Airport 38 KLAX LAX
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Los_Angeles_International_Airport 38.4048 KLAX LAX
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Washington_Dulles_International_Airport 95 KIAD IAD
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Washington_Dulles_International_Airport 95.4024 KIAD IAD
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Van_Nuys_Airport 244.4 KVNY VNY
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/resource/Van_Nuys_Airport 244.4496 KVNY VNY
The question is how do I get a list with the exact decimals only? FILTER with regex on ?elevation in the decimal point doesn't work. Any ideas? Thanks in advance, Timo Kouwenhoven http://www.skybrary.aero (aviation safety knowledge) uOn 8/26/10 11:13 AM, Timo Kouwenhoven wrote: uOn Aug 26, 2010, at 12:08 PM, Kingsley Idehen wrote: I think you misunderstood Timo's goal, Kingsley. I think the goal is to get the most-precise of the available values. The big challenge I see there is uOn 8/27/10 11:09 AM, Ted Thibodeau Jr wrote: Ted, Or this :-) SELECT DISTINCT ?resource ?hasValue ?isValueOf bif:round(bif:min(?elevation)) as ?minelev bif:round(bif:max(?elevation)) as ?maxelev ?icao ?iata WHERE { { ?resource ?hasValue } UNION { ?isValueOf ?resource } { ?isValueOf ?elevation } { ?isValueOf ?icao } OPTIONAL { ?isValueOf ?iata } FILTER regex(?icao, \"^K\") } Links: 1. cknyIe"
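For comparison, a portable variant of the same idea using only standard SPARQL 1.1 aggregates instead of Virtuoso's bif: functions could look like the sketch below; it keeps one elevation per airport (the maximum), which removes the duplicate rows, although it does not by itself guarantee that the more precise figure is the one kept.
SELECT ?airport ?icao ?iata (MAX(?elevation) AS ?maxElevation)
WHERE {
  ?airport a <http://dbpedia.org/ontology/Airport> .
  ?airport <http://dbpedia.org/ontology/elevation> ?elevation .
  ?airport <http://dbpedia.org/ontology/icaoLocationIdentifier> ?icao .
  OPTIONAL { ?airport <http://dbpedia.org/ontology/iataLocationIdentifier> ?iata . }
  FILTER regex(?icao, "^K")
}
GROUP BY ?airport ?icao ?iata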
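For reference, the pattern Alexandre is asking for, a single shared ontology property combined with a language filter, would look roughly like this; Albert Einstein is used purely as an example resource, and whether a Hebrew abstract comes back depends on which language editions are loaded at the endpoint.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?abstract
WHERE {
  ?person rdfs:label "Albert Einstein"@en .
  ?person <http://dbpedia.org/ontology/abstract> ?abstract .
  FILTER ( lang(?abstract) = "he" )
}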
There has been a misunderstanding between someone at my team and someone at Openlink and therefore all data from all languages has been loaded into the public endpoint. The strangely looking URIs are a result of this. We currently work on getting only the usual subset loaded into the endpoint. Therefore the strangely looking property URIs will quickly disappear and the RDF links into other LOD datasets that are currently not loaded will appear again. Sorry for the confusion! Doing Wikipedia cross-language data fusion is indeed tricky. We did some experiments with this, see paper below, but don't think we will be able to provide this for the complete dataset anytime soon. Cheers, Chris Eugenio Tacchini, Andreas Schultz, Christian Bizer: Experiments with Wikipedia Cross-Language Data Fusion. 5th Workshop on Scripting and Development for the Semantic Web (SFSW2009), Crete, June 2009. usion.pdf uAlexandre Passant wrote: uOn 5 Nov 2009, at 15:32, Kingsley Idehen wrote: [] uHi, On 5 Nov 2009, at 14:54, Chris Bizer wrote: No pb - everything has been fixed quickly! I was actually not aware that only a subset is generally loaded in the public interface, interesting. Is that mainly for performance issues or to avoid these confusing properties to appear? Thanks for the pointer! Alex. uAlexandre Passant wrote: uHi all Another bug (or feature?) of the current release re. languages. description OK. But in the lists of wikilinks, a lot of URIs have French fragment identifiers such as which leads nowhere, where one should have or maybe in the context of the French navigation Bernard 2009/11/5 Alexandre Passant < > uBernard Vatant wrote: Ah! We'll look into it. Good time for finessing etc Kingsley" "Dbpedia Spotlight x Dbpedia Lookup" "uHello! What's the best tool - Dbpedia Spotlight or Lookup? Are there any specific cases to use one or another? Luciane." "Jena with eclipse JEE" "uWhen I run a JSP file that calls a Java class containing a SPARQL query retrieving data from DBpedia, an error occurs at runtime like this (note: in my project I added the Jena library to the build path of the project): type Exception report message org.apache.jasper.JasperException: java.lang.reflect.InvocationTargetException description The server encountered an internal error that prevented it from fulfilling this request.
exception org.apache.jasper.JasperException: org.apache.jasper.JasperException: java.lang.reflect.InvocationTargetException org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:549) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:455) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) root cause org.apache.jasper.JasperException: java.lang.reflect.InvocationTargetException org.apache.jasper.runtime.JspRuntimeLibrary.internalIntrospecthelper(JspRuntimeLibrary.java:360) org.apache.jasper.runtime.JspRuntimeLibrary.introspecthelper(JspRuntimeLibrary.java:306) org.apache.jsp.NewFile_jsp._jspService(NewFile_jsp.java:78) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) root cause java.lang.reflect.InvocationTargetException sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) java.lang.reflect.Method.invoke(Unknown Source) org.apache.jasper.runtime.JspRuntimeLibrary.internalIntrospecthelper(JspRuntimeLibrary.java:354) org.apache.jasper.runtime.JspRuntimeLibrary.introspecthelper(JspRuntimeLibrary.java:306) org.apache.jsp.NewFile_jsp._jspService(NewFile_jsp.java:78) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) root cause java.lang.NoClassDefFoundError: com/hp/hpl/jena/query/QueryFactory com.omiama.me.setTopic(me.java:23) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) java.lang.reflect.Method.invoke(Unknown Source) org.apache.jasper.runtime.JspRuntimeLibrary.internalIntrospecthelper(JspRuntimeLibrary.java:354) org.apache.jasper.runtime.JspRuntimeLibrary.introspecthelper(JspRuntimeLibrary.java:306) org.apache.jsp.NewFile_jsp._jspService(NewFile_jsp.java:78) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) root cause java.lang.ClassNotFoundException: com.hp.hpl.jena.query.QueryFactory org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1714) org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1559) com.omiama.me.setTopic(me.java:23) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) java.lang.reflect.Method.invoke(Unknown Source) org.apache.jasper.runtime.JspRuntimeLibrary.internalIntrospecthelper(JspRuntimeLibrary.java:354) org.apache.jasper.runtime.JspRuntimeLibrary.introspecthelper(JspRuntimeLibrary.java:306) org.apache.jsp.NewFile_jsp._jspService(NewFile_jsp.java:78) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) the jsp code is : <%@ page language=\"java\" contentType=\"text/html; charset=windows-1256\"     pageEncoding=\"windows-1256\"%> DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \" package com.omiama; import com.hp.hpl.jena.query.ARQ; import com.hp.hpl.jena.query.Query; import com.hp.hpl.jena.query.QueryExecution; import com.hp.hpl.jena.query.QueryExecutionFactory; import com.hp.hpl.jena.query.QueryFactory; import com.hp.hpl.jena.query.QuerySolution; import com.hp.hpl.jena.query.ResultSet; public class me { private String Topic=\"\"; public void setTopic(String topic) { Topic=topic; queryExternalSources(); }     public void queryExternalSources() {         //Defining SPARQL Query. This query lists, in all languages available, the         //abstract entries on Wikipedia/DBpedia for the planet Mars.         String sparqlQueryString2 = \" SELECT ?abstract \" +                                     \" WHERE {{ \" +                                          \" \" +                                          \" \" +                                          \"          ?abstract }}\";         Query query = QueryFactory.create(sparqlQueryString2);         ARQ.getContext().setTrue(ARQ.useSAX);        //Executing SPARQL Query and pointing to the DBpedia SPARQL Endpoint          QueryExecution qexec = QueryExecutionFactory.sparqlService(\"        //Retrieving the SPARQL Query results         ResultSet results = qexec.execSelect();        //Iterating over the SPARQL Query results         while (results.hasNext()) {             QuerySolution soln = results.nextSolution();             //Printing DBpedia entries' abstract.             System.out.println(soln.get(\"?abstract\"));                                                         }         qexec.close();     } } When run jsp file that call java class contain sparql query retrieve data from DBpedia an error occured on runtime like this : note : in my project I was add Jena library in buildpath of project type Exception report message org.apache.jasper.JasperException: java.lang.reflect.InvocationTargetException description The server encountered an internal error that prevented it from fulfilling this request. 
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) java.lang.reflect.Method.invoke(Unknown Source) org.apache.jasper.runtime.JspRuntimeLibrary.internalIntrospecthelper(JspRuntimeLibrary.java:354) org.apache.jasper.runtime.JspRuntimeLibrary.introspecthelper(JspRuntimeLibrary.java:306) org.apache.jsp.NewFile_jsp._jspService(NewFile_jsp.java:78) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334) javax.servlet.http.HttpServlet.service(HttpServlet.java:728) the jsp code is : <%@ page language=\"java\" contentType=\"text/html; charset=windows-1256\" pageEncoding=\"windows-1256\"%> 2007 > 2007-06 > 2007-06-28 (Latest) (Search) 00:13:13 kidehen? 00:13:19 hi 00:13:30 You are running he dbpedia server? 00:13:34 yes 00:13:57 I find 00:14:11 Bad Gateway! 00:14:11 The proxy server received an invalid response from an upstream server. 00:14:11 The proxy server could not handle the request GET /resource/Berlin. 00:14:12 Reason: Error reading from remote server 00:14:12 If you think this is a server error, please contact the webmaster. 00:14:12 Error 502 00:14:16 hmmm 00:14:30 try 00:14:36 quick way to split issues 00:14:44 Hi and Good Evening, How are you, and stuff. 00:14:54 dbpedia.org has a proxy to the URI I just gave 00:15:10 when it goes down it produces unintended dislocation 00:15:13 :-( 00:15:47 timbl: doing well (bar the heatwave) :-) 00:16:12 timbl: busy as per usual :-) 00:16:59 MIT was fine but hioem is sweltering 00:17:03 home 00:17:08 ditto 00:17:21 Hmmm I get teh isparql page. 00:17:31 and driving most of the day shuttling kids between summer camps etc:-) 00:17:58 timbl: this is an interesting problem you're hitting :-) 00:18:15 timbl: the Linked Data URIs are routed via the proxy in Germany :-) 00:18:24 so the URIs are the currency etc00:18:39 they go offline the URIs become not so cool :-( 00:18:39 The dbpedia.org URIs are in Germany? 00:19:10 they have a proxy service called \"puppy\" that sits in front of Virtuoso over here :-) 00:19:29 its an interesting phenomenon i.e. Linked Data Deployment :-) 00:20:23 so you have SPARQL Servers e.g. Virtuoso which hosts the DBpedia Data Sets and then a variety of Linked Data Deployment services e.g. Puppy in Germany (Chris & Richard's effort) 00:20:43 So you ish out data where each of the URIs are in germany. 00:20:44 timbl: here will be something similar closer to home re. Zitgist :-) 00:20:50 s/ish/dish/ 00:20:56 yes 00:21:02 we are the RDF Data Server 00:21:11 What about the Amazon book stuff? 00:21:20 * kidehen hosted in Germany 00:21:30 on the D2RQ server 00:23:03 soon there would be Zitgist URIs for the same data and the only difference will be depth, breadth, and linkage of the URIs 00:23:33 so dereferencing Berlin from Zigist will take you to the same data albeit via Zitgist URIs 00:24:32 so for now, there would be a common Virtuoso based Data Server for the DBpedia RDF and lots of Services that expose URIs for the data 00:25:12 But now, you think the dbpedia puppy is broken? 00:25:21 yes 00:25:31 or some gateway to Germany etc00:25:41 this happens from time to time 00:26:03 Your isparql knows about 00:26:03 they do have an option to host puppy in the U.S. 
etc but that's really their call :-) 00:26:11 yes 00:26:24 cos thats in the store 00:26:52 you could so a CONSTRUCT against it for consumption in Tabulator for instance 00:27:08 or DESCRIBE 00:27:24 Yes, seems down $ curl -I 00:27:24 HTTP/1.1 502 Bad Gateway 00:27:35 let me check quickly 00:27:37 Who maintains it over there? 00:28:46 it works for me, but you can CURL it since they own that route 00:28:51 you can SPARQL it though 00:29:03 (is SPARQL really case insensitive for keywords?) 00:29:36 hmmmI think so, but not 100% 00:29:42 sure 00:30:05 let me give you a SPARQL Protocol URL re. this URI 00:30:14 i.e. 00:30:56 I need to tinyurl it 00:31:48 00:31:48 Can I just say DESCRIBE in isparql? 00:31:52 A: 00:32:09 yes, but let me also test this and then tinyurl 00:32:16 What was the query? 00:32:31 PREFIX rdf: 00:32:31 SELECT DISTINCT * 00:32:31 FROM 00:32:31 WHERE { 00:32:31 ?s ?p 00:32:32 } 00:34:20 Do I have to put the FROM clause? 00:35:24 describe 00:35:25 from 00:35:51 since we are a Quad Store, the Graph needs to be Identified 00:36:11 This works, though: SELECT DISTINCT * 00:36:12 WHERE { 00:36:12 ?s ?p 00:36:12 } 00:36:32 now this can be aligned to a default data space / storage internally (meaning you can leave it out etc) 00:36:39 yes 00:36:49 because the Graph is in default storage 00:37:03 It doesn't jjust query the whole store , then. OK 00:37:14 correct 00:37:30 one source of our performance :-) 00:38:22 uHi Chris, Slightly shifting topic, what's the timescale with data cleansing and launch of the new data set? We have some possibly interesting results from our experiments with DBpedia films which may be useful to feed back. Let me know if you want more info. Cheers, Tom. On 28/06/07, Chris Bizer < > wrote: uHi Tom, Georgi and Pavel are currently working on the new release. I hope that they will be able to publish it in about 2-3 weeks. Sorry, that it takes so long. Yes, please any info and additional bug reports welcome. Cheers, Chris. On 28/06/07, Chris Bizer < > wrote: uHey, On 28/06/07, Chris Bizer < > wrote: No worries. We're thankful for what we've got :) Cool. We'll summarise what we've found and send it across. Tom. uHi Chris, I've been working with the DBpedia dataset to try and extract Films and assorted metadata for a couple of days now; I've a few observations of things which may or may not have been brought to Georgi or Pavel's attention already. Apologies if I'm repeating what you already know! Notes on DBpedia data (generally found through examining of Film infoboxes and classes, but may hold true on a wider scale!): * The assertion of rdf:type can be somewhat hit-and-miss. There are resources described with type yago:Film which also have type, for example, yago:Train. Furthermore, there are Things for which the assertion rdf:type yago:Film may be debatable - are the categories around Films (e.g. dbpedia:Category:Film, dbpedia:Category:Films_from_1999 etc.) really of type Film? There are around 30,000 Things to be found of type Film, only around 12,000 (according to the stats on dbpedia.org/docs and to those I've found) of which are actually films. In general, those which are not films or categories about films are cinemas, tv series, etc. These do not typically have a dbpedia:director or dbpedia:starring predicate, while real films do (this was the most accurate metric I could devise to find the difference, without trying to read in the entirety of the dataset at once and performing and smarter inference! 
I'd be interested to hear how you came about the number on the DBpedia website!) * The parser from Wikipedia to RDF does a slightly hit-and-miss job of parsing lists of, for example, those starring in a film. Some of these are done perfectly, with URIs for the star (particularly in some more mainstream, and thus better maintained, film entries). Others, however, can be found in the wiki format [[Person One]][[Person Two]][[Person Three]], Person OnePerson TwoPerson Three, or (even harder to get around!) Person One Person Two Person Three, or some combination of these; \"[[Richard Madeley]],(Unaired pilot);[[Jeremy Beadle]],([[1990]]-[[1997]]);[[Lisa Riley]],([[1998]]-[[2002]]);[[Jonathan Wilkes]],([[2003]]);[[Harry Hill]],([[2004]]-present)\" \"[[Bruce Forsyth]] (1988-1990),[[Matthew Kelly]] (1991-1995),[[Darren Day]] (1996-1997)\" \"[[Groucho Marx]] [[George Fenneman]]\" \"[[Les Lye]] and [[#Cast|others]]\" \"[[Rodney Scott]][[Mark Famiglietti]][[Katherine Moennig]][[Ian Somerhalder]][[Kate Bosworth]]Ed Quinn\" \"Gene Wilder[[Teri Garr]][[Cloris Leachman]][[Marty Feldman]][[Peter Boyle]][[Madeline Kahn]][[Kenneth Mars]][[Gene Hackman]]\" \"''[[#Cast|See below]]''\" \"[[Sylvia Sidney]][[Henry Fonda]]\" \"New cast with each episode\" \"[[Freddie Prinze, Jr]][[Saffron Burrows]][[Matthew Lillard]][[Tch\u00e9ky Karyo]][[Jurgen Prochnow]][[David Warner]]Ginny HolderHugh QuarshireKen BonesJohn McGlynnRichard DillaneMark ProwleyDavif Fahm[[Simon MacCorkindale]]Fraser James\" * There isn't an (or at least not that I could find!) obvious distinction between usage of properties like dbpedia:name, dbpedia:title, rdfs:label. In general, to find the title of a film one must try all three of these, and give them some kind of priority (e.g. I'm trying title first, then name, and finally label) * Those Films which do not have infoboxes on their wikipedia page appear to be missing rdfs:label, dbpedia:name, or dbpedia:title triples; this one might be a product of me only looking in the infoboxes.nt set of data, and the data might be available in articles.nt (I've not looked into this one in detail :) * Finally (although I'm told this is something of an ongoing issue, I'll include it for completeness' sake :), encoded URIs aren't always parsed correctly by the DBpedia server, particularly when asking for those with encoded colons, apostrophes, etc. They tend to make it either 404 or 500, with no discernable pattern. I'm guessing this is the same apache proxy issue you solved previously for encoded parentheses, so hopefully might be an easy fix! Hope some of this is helpful - and a big thanks for all the hard work that's gone into the DBpedia project! Cheers, Peter uPeter, On 28 Jun 2007, at 14:43, P.L.Coetzee wrote: > * Those Films which do not have infoboxes on their wikipedia page The labels are in articles.nt (with an \"en\" language tag). This is simply the Wikipedia page title, so it's available for every film. Do you happen to have a few example URIs that show this particular problem? Cheers, Richard uHi Richard, Hmmm, sadly with the 'Bad Gateway' issues DBpedia seems to be having, I can't test and be *sure* which URIs I experience the issue with. If memory serves, (or the equivalent /page/ or /data/ dereferencings) failed, as did . I'll let you know for certain once I can test again :) That's fair enough re: Wikipedia page title; just makes it an extra chunk of file to operate with to obtain the info. Thanks for the response! Cheers, Peter uHi all, the DBpedia proxy is running again. 
Cheers Georgi Von: im Auftrag von P.L.Coetzee Gesendet: Fr 29.06.2007 10:31 An: Richard Cyganiak; P.L.Coetzee Cc: Betreff: Re: [Dbpedia-discussion] Proxy Probelms with DBpedia URIs Hi Richard, Hmmm, sadly with the 'Bad Gateway' issues DBpedia seems to be having, I can't test and be *sure* which URIs I experience the issue with. If memory serves, (or the equivalent /page/ or /data/ dereferencings) failed, as did . I'll let you know for certain once I can test again :) That's fair enough re: Wikipedia page title; just makes it an extra chunk of file to operate with to obtain the info. Thanks for the response! Cheers, Peter uThanks Georgi! So I was wrong about Manos, but the second URI (3:10 to Yuma, ) 404's. 'She Shoulda Said 'No'!'', causes an error 500 from Tomcat (wrapped around a Java error 400), with the stack trace pointing at Jena's SPARQL engine: HttpException: HttpException: 400 Bad Request: rethrew: HttpException: 400 Bad Request com.hp.hpl.jena.sparql.engine.http.HttpQuery.execCommon(HttpQuery.java:273) com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:167) com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:128) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:101) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:95) com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execDescribe(QueryEngineHTTP.java:93) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.execDescribeQuery(RemoteSPARQLDataSource.java:61) de.fuberlin.wiwiss.pubby.RemoteSPARQLDataSource.getResourceDescription(RemoteSPARQLDataSource.java:44) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.getResourceDescription(BaseServlet.java:65) de.fuberlin.wiwiss.pubby.servlets.PageURLServlet.doGet(PageURLServlet.java:24) de.fuberlin.wiwiss.pubby.servlets.BaseResourceServlet.doGet(BaseResourceServlet.java:33) de.fuberlin.wiwiss.pubby.servlets.BaseServlet.doGet(BaseServlet.java:92) javax.servlet.http.HttpServlet.service(HttpServlet.java:690) javax.servlet.http.HttpServlet.service(HttpServlet.java:803) Hope these help isolate the issue for you - let me know if I can be of any further assistance! Cheers, Peter On 6/29/07, P.L.Coetzee < > wrote: [Hide Quoted Text] Hi Richard, Hmmm, sadly with the 'Bad Gateway' issues DBpedia seems to be having, I can't test and be *sure* which URIs I experience the issue with. If memory serves, (or the equivalent /page/ or /data/ dereferencings) failed, as did . I'll let you know for certain once I can test again :) That's fair enough re: Wikipedia page title; just makes it an extra chunk of file to operate with to obtain the info. Thanks for the response! Cheers, Peter uP.L.Coetzee wrote: Peter, If you want to hit the RDF Quad Store directly you can use: If you want to use the SPARQL Query Builder (there are two versions) then try: 1. you default mode at login time or pre execution via drop-down option*). 2. Some sample queries are available from: 1. (for .isparql pages i.e. server generated Linked Data Pages) 2. (for .rq files) I hope this helps. Kingsley uThanks Kingsley - where possible I'm trying to do the processing locally; hitting the sparql endpoint for every film we come across would slow things down a fair amount this end. Just to confirm the empirical evidence - the endpoint is automatically limited to 1000 results, isn't it? Cheers, Peter Quoting Kingsley Idehen < >: uwrote: Yes. Are you using a local DBpedia data set then? 
Kingsley uYes I am - processing the n-triples line by line with RAP, as I doubt (without any evidence whatsoever) this machine would handle loading the entire database at once :) Peter Quoting Kingsley Idehen < >: uPeter, Thanks for the detailed report. A great deal of special characters should work fine now, with the exception of three characters: ':' and '/' need to be fixed in the data extraction code. '\' doesn't work because of some bad interaction between Apache's mod_rewrite and mod_proxy that I'm unable to fix. Thanks again for the report. Richard On 29 Jun 2007, at 11:58, Peter Coetzee wrote: uRichard, Thanks for following up on this! When the extraction code is fixed to follow the MediaWiki rules for minting of URIs, will the resource's public URI change, or just those used in the triple store? That is to say, will the encoding in a URI pointing to be rewritten by Apache's proxy to ':' and thus still refer to the URI known internally as ? Or will any links containing this encoding break? I only ask as I'm in the process of creating a lot of seeAlso and sameAs links to DBpedia resources, and it'd be good to get them 'right' first time, if they're going to change some day soon :) Cheers, Peter Quoting Richard Cyganiak < >: uOn 2 Jul 2007, at 17:37, wrote: Both will change. No. The first encoding will simply disappear from the triple store. The second encoding will be in the triple store *and* available via dereferencing. Yes, links using the encoding will break. Please use the encoding, even though these URIs aren't in the triple store yet and cannot be dereferenced. This affects *only* the characters \":\" and \"/\", which currently are encoded in the triple store, but will *no longer* be encoded in the near future. Richard" "The Future Of Freebase in RDF" "uStatus of Freebase in RDF in Q4 2012 I'd like to share the status of my :BaseKB effort to convert Freebase to RDF, its future, and how it relates to other efforts. (See This is a long letter, but the takeaway is that I’m looking to put together some sort of a group to advance the development of :BaseKB, a product that converts Freebase data to logically sound RDF. If this interests you, keep reading. :BaseKB is the first and only complete and correct conversion of Freebase data to RDF. :BaseKB is possible because there is no fundamental philosophical difference between the Freebase data model and the RDF data model. Freebase spent $20 million or so developing graphd and the proprietary MQL language because they got started early, when SPARQL didn’t exist and when there wasn’t the vibrant competition between triple and quad store implementations that exists today. As a result of this early start, Freebase remains the world’s leading data wiki by a large margin. For a long time, Freebase has made it possible to retrieve a limited fraction of information in RDF format at rdf.freebase.com; this service, for the most part, made it possible to retrieve triples sharing a specific subject, as well as a number of “RDF molecules” involving CVT and mediator types. The official RDF version of Freebase is of limited use for a few reasons: most significantly, it has never been possible to query it with SPARQL, and second, the practical limitations of running a public API mean that Freebase can only publish a limited number of triples for any given subject. Although only a small fraction of concepts in Freebase are affected by this limit, these highly connected nodes play a critical role in the graph. 
Any ‘typical’ graph algorithm will traverse these nodes and thus give unsound results. (To be fair, this is a general problem with the ‘dereferencing’ idea in Linked Data and not the fault of Freebase.) Last April I released the early access version of :BaseKB, the first complete and correct conversion of Freebase to RDF. :BaseKB was supplied as a set of n-Triples files that could be loaded into a triple store and queried with SPARQL 1.1. This early access release contained a subset of information from Freebase, including all schema objects, all concepts that exist in Wikipedia, and all CVTs that interconnect these concepts. :BaseKB was made available for free under a CC-BY just like Freebase. :BaseKB was designed to be a project both competitive and complementary to DBpedia. Anyone trying to do projects with DBpedia will discover that intensive data cleaning is usually necessary to get correct query answers. Freebase’s mode of operation promises better curation than DBpedia and this translates into more correct answers with less data cleaning. Soon after, I announced the first release of :BaseKB Pro, a commercial product comprising all facts from Freebase, updated weekly on a subscription basis. :BaseKB Pro was not a commercial success. I didn’t sell a single license. Around the time this project was launched, I accepted a really great job offer so I’ve had limited time to work on :BaseKB and related projects. Not long after :BaseKB was announced, Google announced plans to publish an official RDF dump for Freebase in Summer 2012. This was a welcome development, however, this announcement is one reason why :BaseKB development was on the back burner this summer. In this time I’ve been happy to hear about certain work on reification at Freebase and I’ve also seen that DBpedia and WikiData are both evolving in the correct directions. As of October, no RDF dump has been published by Freebase, and the information I have leads me to believe that we can’t count on Google to provide a workable RDF dump of Freebase. Thus, I’m beginning to reassess the competitive landscape and to reposition :BaseKB. I haven’t followed the discussion list closely in the past few months, but I did see a report that a Google engineer had great difficulty loading a Freebase dump into a triple store and concluded this wasn’t practical to do without an exotic and expensive computer with more than 64G of RAM. I don’t know what data set was used, or what tools. I do know that I can easily load both :BaseKB and :BaseKB Pro on a Lenovo W520 laptop with 32 GB of RAM (bought from Crucial at a price much lower than OEM.) I demonstrated queries against :BaseKB Pro to individuals I met at the Semantic Technology Conference in San Francisco this July. Maybe it’s just hard to find good help these days. It takes me 1 hour to load :BaseKB into Virtuoso OpenLink on an older computer with 24 GB of RAM. I know the Franz people have loaded :BaseKB into Allegrograph and that others have loaded it into BigData. Unlike many popular RDF data sets, :BaseKB passes a test suite that includes a streaming version of Jena ‘eyeball’ for a high degree of compatibility with triple stores and other RDF tools. :BaseKB accomplishes a considerable level of compression relative to the quad dump and the (unusable) Linked Data representation. Nearly 8% of the quad dump consists of statements to the effect that all but a handful of objects are world writable. 
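After loading the n-Triples files into a store, a quick sanity check of the kind below (total triple count, or triples per predicate) is a common way to confirm that the load completed; this is generic SPARQL 1.1, not something specific to :BaseKB.
SELECT ?p (COUNT(*) AS ?uses)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT 20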
The Linked Data representation often uses two triples to represent what :BaseKB does in one, which results in hundreds of millions of triples of harmful overhead. Performance and compatibility with were baked into :BaseKB in the earliest stages of development. I’d like to say that I had some special insight into the problem of converting Freebase data, but no, like a certain Winston Churchill quote, I discovered the correct way to do it only after exhausting all of the other alternatives. I’m quite fortunate that I had some time where I could avoid the usual distractions involved with software projects and work out the math. One problem that bothered me early on was that I needed information from the schema to assign types to triples; if the schema told me that the object of a certain predicate was always an integer, I could use that to generate a triple with an integer object (that would sort, for instance, like an integer.) This process involved joining the predicate field with the subject of a quad in the schema field. However, a predicate like “/people/person/date_of_birth” shows up as “/m/04m1” in the subject field. This seemed to be a “chicken and egg” problem until I realized that, very simply, queries against Freebase work correctly when mid identifiers are used as primary keys. This has the disadvantage that the predicates are no longer machine readable, you have to write something like ?person fbase:m.04m1 ?date . in your SPARQL queries. Once I recognized that the problem of converting names in the dump to mids was the real problem, everything else was downhill. When you treat mids as primary queries, SPARQL queries give logically sound results against Freebase. The disadvantage of this, of course, is that queries are harder to write, but this is overcome by the basekb tools which rewrite names that appear in queries in the same way that the MQL query engine does. You can join other data sets to :BaseKB by grounding (smooshing) them through the tools. Alternatively, you can write OWL and RDFS statements that map Freebase predicates to well-known vocabularies like foaf. I think people have found this answer distasteful, so they’ve often tried to substitute “human readable” identifiers for the mids. Perhaps somebody really can make that work, but it’s a harder problem than it seems at first glance. For instance, important predicates have more than one name and you have to support all of them. You might think owl:sameAs would help here but it doesn’t, not with the standard interpretation. It’s probably not hard to make something that’s almost right but the QA work in making something “half-baked but good enough” is often vastly greater than that of making something perfect. I had to get the job done with a one-man army corp, so I used the simplest possible correct answer. I don’t know it for a fact, but I think it is quite possible that the development of a Freebase RDF dump inside of Google may have taken a wrong turn somewhere. I put development of :BaseKB on hold in July for several reasons, one of which is that I failed to sell any subscriptions for the :BaseKB Pro product. Around this I also received a great job offer which I subsequently accepted, so I have had less time to work on projects of this sort. :BaseKB was clearly ahead of the market last July, but I think it’s time to develop partnerships that will keep it relevant. Planned monthly releases of the :BaseKB product did not materialize because I haven’t had time to diagnose problems in the conversion process. 
For instance, one quad dump that I downloaded has a single quad in shard 13 that throws an exception in one processing stage, which causes the system to abort. Fixing this is a matter of downloading a recent quad dump and running it in the debugger; almost certainly it's a very small problem. Somebody who wasn't so obsessive about data quality would probably be happy to just eat the exception and lose the quad. Similarly, I understand changes have been made in how descriptions are implemented in Freebase, and this may allow a simplification of the system, which previously required several processing stages to merge in descriptions from the simple topic dump as well as a web crawler to get descriptions from (occasionally) documented schema objects. Now things should be simpler. I'm not able to maintain :BaseKB on a week-by-week basis, so I'm looking to the community for help. I'm working on a plan to put :BaseKB in the hands of people who can use it, and I'm considering options such as licensing the technology behind it or donating it to an open source project. To do either I'll want to have a credible plan to make :BaseKB sustainable. Please write me at if you are interested. I'll talk a bit about the software that creates :BaseKB. It all revolves around a framework called "Infovore", an RDF-centric Map/Reduce framework that runs in multiple threads on a single computer. The framework, at the moment, is designed for high efficiency at processing Freebase-scale data. Unlike triple-store-based systems, Infovore's streaming processing has minimal memory requirements. In fact, the most economical environment for running Infovore in AWS is a c1.medium instance with just 1.7 GB of RAM. It completes roughly 18 stages of processing in about 12 hours on a c1.medium to convert the contents of Freebase to correct RDF. I get better performance on my personal workstation, but operation and development of Infovore is very practical on the underpowered laptops that software developers seem to be stuck with so often. Infovore is written in Java and uses the Jena framework; reducers collect groups of statements together into models, upon which data transformations can be specified using SPARQL 1.1. Lately I've been working with much larger data sets in Hadoop and studying the Map/Reduce model used there, and it seems very likely that the system could be made more scalable and faster in wall-clock time (at some increase in hardware cost per quad) by porting it to Hadoop. I've also considered adding support for Sesame and OWLIM-Lite, which may give better performance and inference abilities. Infovore also contains a system for high-speed mapping of "human readable" Freebase identifiers to mid identifiers and a system for applying space-efficient in-memory graph algorithms to tasks such as correct pruning of the complete :BaseKB Pro into the much more usable :BaseKB. (One of the many contradictions in my business plan was that :BaseKB is a more commercially usable product than :BaseKB Pro. :BaseKB takes advantage of the policies of Wikipedia that lead to a much closer mapping between concepts in the system's point of view and concepts in the minds of end users than exists in Freebase as a whole.)
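As an illustrative aside to the "collect statements into per-subject models, then transform them with SPARQL" design described above: the following is a rough Python/rdflib analogue, not the actual Infovore code (which is Java/Jena). It assumes the input N-Triples file has already been sorted by subject so that all statements about one subject arrive consecutively; the file name is a placeholder.

```python
# Rough sketch of the reduce-over-per-subject-models idea, under the assumption
# that the .nt input is sorted by subject. Not the real Infovore implementation.
from itertools import groupby
from rdflib import Graph

ASK_PERSON = "ASK { ?s a <http://xmlns.com/foaf/0.1/Person> }"

def models(path):
    """Yield one rdflib Graph per subject from a subject-sorted .nt file."""
    with open(path, encoding="utf-8") as f:
        lines = (l for l in f if l.strip())
        for _, group in groupby(lines, key=lambda l: l.split(" ", 1)[0]):
            g = Graph()
            g.parse(data="".join(group), format="nt")
            yield g

def count_people(path):
    people = 0
    for g in models(path):
        # Any SPARQL 1.1 transformation could run here; an ASK is the simplest.
        if g.query(ASK_PERSON).askAnswer:
            people += 1
    return people

if __name__ == "__main__":
    print(count_people("sorted_subset.nt"))  # hypothetical input file
```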
The pruning algorithm is capable of creating other kinds of consistent subsets; if you want a database of concepts connected with professional wrestling, or of things that Shakespeare might have known about, this is not science fiction, it's just what Infovore can do, and you can tell it exactly what to do by writing SPARQL queries. I know many more people use SQL databases than SPARQL databases today, and Infovore has a good answer for them: SPARQL queries give answers in a tabular format exactly like a SQL table, so it's quite easy to define Freebase-to-relational mappings with SPARQL queries. In all, Infovore is a good answer to the high memory consumption of triple stores; by preprocessing RDF data to contain exactly what you need before loading it into a triple store, you can handle large data sets and still enjoy the flexibility of SPARQL and RDF. So, if you'd like to see an up-to-date, correct conversion of Freebase to RDF now, rather than (possibly) never, send me an email. ( )"
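As an illustrative aside to the mid-based query approach described in the message above, here is a minimal sketch of what such a query looks like in practice. The endpoint URL, the http://rdf.freebase.com/ns/ namespace, and the use of fbase:m.04m1 for /people/person/date_of_birth (taken from the example in the message) are assumptions; adjust them to your own :BaseKB installation.

```python
# Minimal sketch: querying a Freebase-derived RDF store that uses mids as
# primary keys. Endpoint URL and namespace are assumptions -- adjust as needed.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://localhost:8890/sparql"  # hypothetical local endpoint

query = """
PREFIX fbase: <http://rdf.freebase.com/ns/>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?person ?name ?date WHERE {
    # fbase:m.04m1 stands for /people/person/date_of_birth here,
    # following the example given in the message above.
    ?person fbase:m.04m1 ?date .
    OPTIONAL { ?person rdfs:label ?name . }   # labels keep the mids readable
}
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row.get("name", {}).get("value", "?"), row["date"]["value"])
```

Joining on rdfs:label (or on a name-to-mid mapping such as the one the basekb tools provide) is what keeps the query results readable even though the predicates themselves are opaque mids.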
"I mapped, now what?" "uDBPedia, I've got mapping edit permissions, so I'm dangerous now :-) A few questions: 1) I updated When will it update? 2) When I edit Couldn't load PropertyMapping on page Mapping en:Infobox spaceflight. Details: Unknown mapping element PropertyMapping 3) My info box mapping I'm guessing a refresh needs to happen somewhere? 4) When will the following SPARQL query return something at SELECT distinct ?sat WHERE { ?sat a . } limit 10 ( Thanks for the help! Tim uHi Timothy, Welcome to the DBpedia Mappings Community! I'll try to help you as much as I can until someone with more knowledge answers. 1) It could be that the OntologyFeeder service is down; let's wait a bit. If the changes are not mirrored by tomorrow, there is definitely something wrong. 2) You mapped it this way: {{TemplateMapping The correct way is this: {{TemplateMapping | mapToClass = ArtificialSatellite | mappings = {{PropertyMapping | templateProperty = COSPAR_ID | ontologyProperty = cosparId }} }} 3) This is strange; it seems the Mappings Wiki does not have this infobox in its system. It's not present in although it's one year old in Wikipedia. 4) It could be that the live extraction is down. Cheers, Alexandru On Thu, May 8, 2014 at 7:13 PM, Timothy Lebo < > wrote: uAlexandru, On May 8, 2014, at 1:57 PM, Alexandru Todor < > wrote: Thanks :-) Sounds good. I saw this style around, just thought it was another option. I'll use it. Probably related to Ah. A status indication would be helpful :-/ Regards, Tim uDBPedia, 1) and 4) below haven't "updated" after a day's wait. Did I do something wrong? Are my expectations incorrect? Is a service down? Thanks, Tim On May 8, 2014, at 1:13 PM, Timothy Lebo < > wrote: uI can't say if the service is down, but it's probable. Please continue mapping; I'm sure any bugs/issues will be fixed in a couple of days. The mappings wiki works, and any mappings you make will find their way into DBpedia and are greatly appreciated by the community. Cheers, Alexandru uHi Tim, Sorry for the delayed response, some details about the issues you mentioned: 1) I think this is related to #165 . The ontology feeder is a separate component that hasn't been updated for a while. I am working on a fix ( 2) Alexandru pointed out the right syntax. 3) As mentioned in the issue The statistics were based on the dump of April 2013 and this infobox appeared in June. So this is not a bug, but we should update the statistics more often :) We will need to upgrade the mappings server to calculate them directly there. We'll see to that. 4) DBpedia Live has 3 feeds: 1) wikipedia edits, 2) mapping changes and 3) pages unmodified for more than 1 month. Everything gets in a queue and we process the queue with the same priority.
I recently deployed the abstract extraction in Live, but due to the Lua scripts that are now used in Wikipedia the page extraction became so slow that, as the framework tries to keep up with #1, it creates too many Apache threads and makes the whole site quite unresponsive. Your query now returns results [1], but I just disabled the abstract extraction (again) until we scale things a bit. Btw, you can also visit the stats page for a status indicator: As Alexandru mentioned, your mappings are needed even if they do not reach the Live server fast :) They can be used for static extractions and the next DBpedia release. Cheers, Dimitris [1] On Sat, May 10, 2014 at 8:18 PM, Alexandru Todor < > wrote: uThanks for the update, Dimitris. That sure is a lot of moving parts. Could you explain why my query at SELECT (count(distinct ?sat) as ?c) WHERE { ?sat a . } returns "11", while the page shows a little more than 2,000 pages that use the template? Thanks, Tim On May 12, 2014, at 2:32 AM, Dimitris Kontokostas < > wrote: uChanges triggered by mapping changes have lower priority than wikipedia edits. These changes usually involve a huge number of article updates (e.g. 2K in this case) and there are probably other mapping updates pending. At the moment Live is ~1 day back (11/5) but processes ~1-1.5K pages/minute, so you should see the number increasing gradually after a while. Also consider adding more property mappings in the spaceflight infobox. Cheers, Dimitris On Mon, May 12, 2014 at 3:50 PM, Timothy Lebo < > wrote: uDimitris, On May 12, 2014, at 9:22 AM, Dimitris Kontokostas < > wrote: Yes, I can see how this would cause a burden. Thanks for explaining it. Thanks. Is there anything that conveys the state of the queue? That'd be fun to watch :-) Or, is there a place where I can contribute some development? I was going to, but the "properties not yet mapped" link led to a stack dump: -Tim uOn Mon, May 12, 2014 at 4:31 PM, Timothy Lebo < > wrote: Only the statistics page at the moment ( But, we have an open issue on this and no one assigned yet :) You can also do this manually like you did with the cosparId, but you can also wait; I hope in a couple of weeks we'll have new stats re-deployed. Cheers, Dimitris uI'd be happy to discuss this, first with an eye towards just *showing* what's in the queue. Should we do this off-list (or, take it to the dev list)? Thanks. I'm just trying to stitch a minimal example end-to-end at the moment. I'll hold off. Regards, Tim" "Generating DBpedia Lookup index for specific data" "uI'm trying to generate a DBpedia Lookup index for specific data (not for the entire DBpedia and Wikipedia data). How can I get Wikipedia dump data for my specific data? Does DBpedia Lookup support generating an index for other languages such as French and Arabic? Regards. uYou mean that you already know the articles you want from Wikipedia?
Then it could be a script that filters out triples associated with resources outside your set from our dumps, or, in a similar way, generates a Wikipedia dump with your articles only. We don't have such a script, but it should be easy to create. I'm not aware of Lookup in other languages. AFAIK, Lookup reuses some parts from Spotlight, and Spotlight is available in other languages, so I'd assume it is possible, but Pablo or Max can correct me. Best, Dimitris On Sun, Nov 16, 2014 at 9:01 AM, Nasreddine Cheniki < uIt should work in other languages, as far as I can see. You may need to choose the right analyzer for Lucene, and make sure the code doesn't break with non-ASCII stuff. I don't think I ever tried to index anything other than English with Lookup. Other developers may have. On Nov 19, 2014 7:48 AM, \"Dimitris Kontokostas\" < > wrote:" "Software license" "uHi, I was wondering under which license you are releasing your software, i.e. the software that extracts information from Wikipedia and makes lists like this one (1). Furthermore, do you believe that a page like the one above (1) is a derivative work of Wikipedia? Or is it derived from DBpedia? Or both? Or neither? Should I use a CC-BY-SA license (Wikipedia) or get permission from the DBpedia authors (I am saying so because I don't know the terms of use of the DBpedia license) if I want to make a derivative work from a list extracted from DBpedia? Just as an example: a suite of maps (for example, for tourism) for navigators, using the list of German cities the Rhine passes through, or the list of museums in Berlin. Is this a derivative work of Wikipedia? Or is it derived from DBpedia? Or both? Or neither? (once again) Thanks for your answers. Cristian (1) ?fc=30 uCristian Consonni wrote: uIn my opinion, the DBpedia mappings and the information extracted from Wikipedia are derivative works of the CC-BY-SA licensed Wikipedia templates and content, so CC-BY-SA probably should not be violated in that respect. You don't need written permission to use CC-BY-SA licensed materials; they are offered equally to everyone automatically. The engine that uses the mappings could be licensed under another license though. Any private content that you use the extraction framework for with your custom mappings should not relate to Wikipedia or DBpedia. Cheers, Peter On 16 March 2010 02:19, Cristian Consonni < > wrote: uFor the sake of completeness, I am not about to start a new commercial activity using DBpedia (I would be happy to do so, but I am not); this was more of a "philosophical" question. I am a member of Wikimedia Italia (the Italian chapter of the Wikimedia Foundation), and in a recent discussion on the CC Italian mailing list[1] (which is unrelated to WMF, but many people have common ideas ;-)) somebody was wondering whether CC-BY-SA is suitable for databases (like OpenStreetMap) and how database-like works built from Wikipedia (or OSM or any other free project) should be licensed. 2010/3/15 Peter Ansell < >: IMHO, this is a fundamental point. I mean, even if the DBpedia software were released under a viral license[2] (like the GPL, for instance), and the Wikipedia license also has this property (the "SA" condition), a "derivative" work in the sense above could probably still be released under a non-free license. But isn't this likely to betray the spirit of the original licenses?
On the other hand, a piece of software *or something built on top of a research engine, using only the information and not the original software* is quite a different product from either an encyclopaedia or the research engine itself. How can you call it a "derivative work", and why should it be affected by the virality of the "original work"? On a different but not unrelated topic, I repeat that I still haven't understood under which license the DBpedia software (i.e. the DBpedia engine) is released. Thank you for your time. Cristian [1] a pointer (if you can read Italian) [2] a viral license requires the derivative work to be released under the same license as the original uOn 16 March 2010 08:54, Cristian Consonni < > wrote: The information is mapped directly from Wikipedia. The mapping files, and the resulting data, have very direct informational links to the content in each of the Wikipedias, so the use of the DBpedia software on Wikipedia CC-BY-SA data will create data files that need to be licensed under CC-BY-SA. If the engine is sufficiently removed from the Wikipedia scenario, i.e., if it could be used on any MediaWiki dump, then it probably wouldn't be classed as a derivative work. It depends on whether there is hard-coded information in the software that relies on Wikipedia, I guess, as to whether it fits or not, but it is a vague area in many senses. According to the SourceForge project page [1], the engine is released into the Public Domain, but you would have to look at the individual files for their copyright notices to confirm that there aren't other licenses, and check the dependencies to make sure there are no viral dependencies, like GPL for instance. Public Domain is definitely not viral, but it is open source if people want to extend it, and they can choose to keep the Public Domain or use another open license. Cheers, Peter [1] u2010/3/16 Peter Ansell < >: Thanks for your clear answer. One more question: if you use GPLed software, virality requires that you release your derivative work under the GPL. I am concluding that DBpedia is released *both* in the Public Domain and under the GPL. Is that correct? To be clear: you are free to release your code under the conditions you like, but if you want to make a derivative work from GPLed software you have to release a "copy" (call it a "version") of your "derivative work" under the GPL. This is true (AFAICT) for CC SA licenses, also. Cristian p.s.: I am sorry to be late with my answer =). uHoi, When you use an application, it does not follow that the content becomes part of that license. It is perfectly OK to write proprietary fiction using OpenOffice or, conversely, public domain fiction using MS Word. The licensing of the material generated by DBpedia has Wikipedia as its origin, and its content is now available under CC-BY-SA as well as the GFDL. When you generate the DBpedia content, there are people who consider it to be a derivative, and consequently the combination of licenses would be appropriate; there are also people who consider the process an abstraction of facts, and this allows for a different approach. Facts as such cannot be licensed. As a collection, facts can be licensed. However, given that the DBpedia software is available under a free license, claiming copyright on such a collection is problematic, because everyone is invited to use the software and, mutatis mutandis, the resulting collection may be different. My personal belief is that the approach of making the facts Free is the right approach.
Thanks, GerardM On 22 March 2010 10:50, Cristian Consonni < > wrote: u2010/3/22 Gerard Meijssen < >: I perfectly agree with this particular example; the point here (for DBpedia, OSM and similar projects) is that the definition of a "derivative work" of such projects is more difficult to give in some cases, as Peter pointed out: "It depends on whether there is hard coded information in the software that relies on Wikipedia I guess as to whether it fits or not, but it is a vague area in many senses." The original and most fitting example was "a software for a navigator (like TomTom, for instance) which uses information extracted from OSM": logically this work does rely on OSM data, but the software itself could hardly be considered a derivative work. I believe that DBpedia is in the same situation. Note that this problem can be applied "recursively" to products derived from data extracted through DBpedia, so I perfectly agree with Peter: all this is very vague and probably, IMHO, the existing free licenses aren't designed for this (a lawyer expert in free licenses could be more precise than me, though). My last mail was about the fact that if DBpedia uses GPLed software, then, being itself (as software) a work derived from other software, it should be released under the GPL to comply with the GPL itself. Or at least, as I was explaining in my previous email, a copy of it should be licensed under the GPL (others can be licensed under different conditions). But, here again, the problem is the definition of derivative work. Here, for instance, is a FAQ from the GNU site [1]: If a program combines public-domain code with GPL-covered code, can I take the public-domain part and use it as public domain code? You can do that, if you can figure out which part is the public domain part and separate it from the rest. If code was put in the public domain by its developer, it is in the public domain no matter where it has been. I am sorry to have started such a mess! -_- Cristian [1] gpl-faq.html#CombinePublicDomainWithGPL uChristian, Gerard, all u2010/3/22 Ted Thibodeau Jr < >: This is a good point: the "list of cities in Germany" is not inherently copyrightable IMHO, but as a collection of data extracted from a (common) source it likely is. The funniest thing is that, if this sounds strange to you, think about the fact that with traditional (sic) copyright you can't access those data. uCristian Consonni wrote:
Thanks, Dan uI can report my progress on this front. I've got a system in place that moves Freebase dumps, recompresses them and stores them in the AMZN cloud. I can suck in DBpedia data the same way. I'm hadoopifying my Infovore tools so I can do my preprocessing, run parallel super eyeball and be able to run basic reports. The plan is to keep most of the results in requester-pays S3 buckets, which can be accessed for free inside AWS, particularly with Elastic MapReduce. The first release of the system will focus on rules that apply to individual triples, but it's not a difficult extension of that to build something that only copies records where the subjects are kings and queens, about sealing wax, whatever. As a rough idea of costs and time involved, it takes around two hours, $2 in transfer cost and about $1 in CPU to package the dump for EMR. It will take more EMR costs to clean the data up and probably compress it to speed up your queries. A somewhat tuned system could deliver you a custom subset of DBpedia in an hour or two on a cluster that costs about as much to run as a minimum wage employee. You might then need to transfer the files out of AMZN, but TANSTAAFL. uThanks Paul. The end goal of this data is import into AWS SimpleDB and CloudSearch (for the strings), as a matter of fact. What I was doing though was having all of my data sources (also: Discogs, MusicBrainz) export to a common-ish JSON structure which then gets uploaded to the above services. I was keen on ways of just working on the dbpedia tuples from the download. I'm still looking at the feasibility of this. One grep of the nt file on a consumer SSD gets through the file in just over two minutes, which bodes well. I will continue with this line of investigation. The other thing to investigate is writing custom formatters (I think they're called) for the extraction framework, though I'm not sure how 'pluggable' that is yet. Dan On Mon, Jul 15, 2013 at 5:01 PM, Paul A. Houle < > wrote: uI'll keep this use case in mind. Really there is no need to use Hadoop to handle English DBpedia data with Infovore because DBpedia-en isn't as big as Freebase. It may be possible to design the cutter so at least parts of it can be run without Hadoop.
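As an illustrative aside to the line-oriented filtering Dan describes above (and to the "script that filters out triples outside your set" idea suggested earlier in the Lookup thread): the following is a minimal sketch in Python. The input and output file names and the plain-text list of subject URIs are assumptions made purely for illustration.

```python
# Minimal sketch: keep only the triples whose subject is in a given set of
# resource URIs. Input/output file names are hypothetical placeholders.
import bz2

def load_subjects(path):
    """One URI per line, e.g. <http://dbpedia.org/resource/Berlin>."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def filter_dump(dump_path, subjects, out_path):
    kept = 0
    with bz2.open(dump_path, "rt", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            # N-Triples is line-based: the subject is the first whitespace-
            # separated token, so no full RDF parser is needed for this step.
            if line.split(" ", 1)[0] in subjects:
                dst.write(line)
                kept += 1
    return kept

if __name__ == "__main__":
    subjects = load_subjects("my_resources.txt")
    n = filter_dump("mappingbased_properties_en.nt.bz2", subjects, "subset.nt")
    print(f"kept {n} triples")
```

Because the dump is processed as a stream, memory use stays bounded by the size of the subject set rather than the size of the dump, which is the same property that makes the grep approach above practical.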
On the other hand I think there is also going to be more interest in other language Dbpedias, so now "metamemomic" work means facing up to bigger databases. For instance, en-Wikipedia has pictures of perhaps half of the "pretty places" in the world that one might want photos of, travel to, whatever and it would be great to call on them too. From: Dan Gravell Sent: Monday, July 15, 2013 9:34 AM To: Subject: [Dbpedia-discussion] Strategies to download subsets of DBPedia What is the most efficient (CPU and network time) of extracting subsets of DBPedia? I am only interested in and the first level of relationships. First, I want to work on the data dumps provided either by DBPedia or Wikipedia (via the extraction framework, maybe). I realise I could do what I want via - It adds load on dbpedia.org - dbpedia.org often appears to have maintenance periods - There are limits placed on the number of results from dbpedia.org However, the DBPedia dumps themselves have one big problem: they are so massive it appears to take days to do anything with them. Loading them into Apache Jena for instance takes ages. I also tried a little sed'ing and awk'ing of the file but with little success. How is everyone else dealing with subsets of the data dumps? Is it possible to configure the extraction framework to ignore input records, or maybe output to something other than text n-tuples which would then be easier to slice and dice (e.g.
output to SQL, then perform a query?). Thanks, Dan uHi Dan, On Tue, Jul 16, 2013 at 11:26 AM, Dan Gravell < >wrote: They 're pretty pluggable already. There are 2 extra formatters for DBpedia live [1] but both are used manually in the code. You can adapt the PolicyParser [2] class to enable them in the configuration file for the dump extraction. Best, Dimitris [1] [2] uThanks Dimitris. I'm making fairly good progress right now by simply brute forcing a few scans over the .nt file and filtering out the lines of interest. On a consumer grade SSD this takes about 7 minutes per scan, and as this is a batch, non-interactive nor user facing job, this is acceptable. I hope to write up and maybe publish what I've done (my awk/sed skills fall short so I ended up scripting something in Scala). Dan On Thu, Jul 18, 2013 at 9:44 AM, Dimitris Kontokostas < >wrote: uDear all We are pleased to announce our Slicing approach. Since many of the LOD datasets are quite large and despite progress in RDF data management their loading and querying within a triple store is extremely time-consuming and resource-demanding. To overcome this consumption obstacle, we propose a process inspired by the classical Extract-Transform-Load (ETL) paradigm, RDF dataset slicing. You can find further information here: During the following month, the source code will be publicly available. On Thu, Jul 18, 2013 at 10:54 AM, Dan Gravell < >wrote:" "DBpedia ontology - full list of classes" "uHello, I'm trying to automatically parse the DBpedia ontology, but I've come across some inconsistencies between the published lists of ontology classes. As you can see below, some classes appear only in one list, and some only in the other. Do you know who's in charge of maintaining the ontology? And can anyone tell me which of these resources is the official, updated one? (or maybe I should use some third resource?) *t1 = ontology classes read from *t2 = ontology classes read from *print t1-t2* set([' ' ' ' ' ' 'http://dbpedia.org/ontology/TimePeriod', 'http://dbpedia.org/ontology/CyclingCompetition', 'http://dbpedia.org/ontology/Ligament', 'http://dbpedia.org/ontology/Opera', 'http://dbpedia.org/ontology/Enzyme', 'http://dbpedia.org/ontology/Novel', 'http://dbpedia.org/ontology/Manhwa', 'http://dbpedia.org/ontology/RailwayStation', 'http://dbpedia.org/ontology/Comics', 'http://dbpedia.org/ontology/Anime', 'http://dbpedia.org/ontology/MovieGenre', 'http://dbpedia.org/ontology/ControlledDesignationOfOriginWine', 'http://dbpedia.org/ontology/Cartoon', 'http://dbpedia.org/ontology/Manga']) *print t2-t1* set(['http://dbpedia.org/ontology/Instrument', 'http://dbpedia.org/ontology/ComicBook']) Thanks, Omri Hello, I'm trying to automatically parse the DBpedia ontology, but I've come across some inconsistencies between the published lists of ontology classes. As you can see below, some classes appear only in one list, and some only in the other. Do you know who's in charge of maintaining the ontology? And can anyone tell me which of these resources is the official, updated one? (or maybe I should use some third resource?) 
t1 = ontology classes read from http://mappings.dbpedia.org/server/ontology/classes/ t2 = ontology classes read from http://downloads.dbpedia.org/3.8/dbpedia_3.8.owl.bz2 print t1-t2 set([' http://dbpedia.org/ontology/Genre ', ' http://dbpedia.org/ontology/RadioProgram ', ' http://dbpedia.org/ontology/Manhua ', ' http://dbpedia.org/ontology/Wine ', ' http://dbpedia.org/ontology/LightNovel ', ' http://dbpedia.org/ontology/HandballPlayer ', ' http://dbpedia.org/ontology/TimePeriod ', ' http://dbpedia.org/ontology/CyclingCompetition ', ' http://dbpedia.org/ontology/Ligament ', ' http://dbpedia.org/ontology/Opera ', ' http://dbpedia.org/ontology/Enzyme ', ' http://dbpedia.org/ontology/Novel ', ' http://dbpedia.org/ontology/Manhwa ', ' http://dbpedia.org/ontology/RailwayStation ', ' http://dbpedia.org/ontology/Comics ', ' http://dbpedia.org/ontology/Anime ', ' http://dbpedia.org/ontology/MovieGenre ', ' http://dbpedia.org/ontology/ControlledDesignationOfOriginWine ', ' http://dbpedia.org/ontology/Cartoon ', ' http://dbpedia.org/ontology/Manga' ]) print t2-t1 set([' http://dbpedia.org/ontology/Instrument ', ' http://dbpedia.org/ontology/ComicBook' ]) Thanks, Omri uHi Omri, please take into consideration that the mappings wiki is continuously changing. On 10/29/2012 04:02 PM, Omri Oren wrote: this is because those classes are relatively new, so they were not included in the ontology that was published with DBpedia 3.8. For instance if you check the history of class \"Genre\", you can see that it is dated to August 20th, 2012. For class \"Instrument\" it is already there, you can find it here [1]. Class \"ComicBook' is removed from the mappings wiki. [1] http://mappings.dbpedia.org/index.php/OntologyClass:Instrument uAnd as for who is responsible for maintaining the ontology, the answer is: you. Well, you, me and anybody else interested in it. It can be edited via the wiki, as Mohamed pointed out. On Mon, Oct 29, 2012 at 9:02 PM, Mohamed Morsey < > wrote:" "Reading William Shakespeare using dbpedia" "uMorning, I've just started exploring dbpedia in a couple of ways but have come up with an error. This is my first foray into the Semantic Web so apologies if it is a naive question. I'm trying to read both and to using the information in the open Shakespeare project (www.openshakespeare.org) and the Milton project. I tried using Python's rdflib to parse the page and then the Simile Babel convertor to experiment in JSON but keep getting an error. Babel, when I use retrieve and convert: unqualified attribute 'lang' not allowed [line 1, column 68] and Python throws back: SAXParseException: tag. I'd be grateful for any help or advice. Thanks, Iain Iain Emsley uHello, It sounds like you are trying to parse the HTML representation at rather than the RDF/XML representation at . Try using the latter URI, and setting the HTTP header Accept: application/rdf+xml so that you get RDF/XML instead of being redirected to the HTML version. If you don't know how to set an HTTP header, the following URI will give you the raw RDF/XML: Cheers, Ryan On Sun, Sep 7, 2008 at 3:48 AM, Iain Emsley < > wrote: uHey, uri=http%3A%2F%2Fdbpedia.org&query;=DESCRIBE+%3C > rce/William_Shakespeare%3E&output;=xml Or use The /resource namespace does 303 redirects based on content negotiation either to /page (html-document) or /data (rdfxml-document). Cheers, Georgi" "Stale Data Question" "uHello. I saw that DBpedia Live is once again receiving updated data however I'm seeing old data for some queries. 
Here's an example using SELECT ?teamName ?label ?coach ?coachLabel WHERE { ?teamName rdfs:label \"Indiana Pacers\"@en . ?teamName rdfs:label ?label . ?teamName dbo:coach ?coach. ?coach rdfs:label ?coachLabel . } This returns a coachLabel value of \"Frank Vogel\"@en According to been coach since May. Is live.dbpedia.org still working through past updates or is there a different query I should be running? Thanks! Jason Hello. I saw that DBpedia Live is once again receiving updated data however I'm seeing old data for some queries. Here's an example using Jason uThe query is correct and Live is indeed running again but since it was stopped for a few months it is currently re-indexing all pages atm it is ~60-70% and I expect it to be complete by next week if you look at dbo:wikiPageExtracted property where it states that the last extraction of that page was at 2016-01-30 17:24:04 so your query hit the ~30% that is still stale On Thu, Jul 21, 2016 at 9:48 PM, Jason Hart < > wrote:" "StrepHit 1.0 Beta Release" "uTo whom it may interest, Full of delight, I would like to announce the first beta release of *StrepHit*: TL;DR: StrepHit is an intelligent reading agent that understands text and translates it into *referenced* Wikidata statements. It is a IEG project funded by the Wikimedia Foundation. Key features: -Web spiders to harvest a collection of documents (corpus) from reliable sources -automatic corpus analysis to understand the most meaningful verbs -sentences and semi-structured data extraction -train a machine learning classifier via crowdsourcing -*supervised and rule-based fact extraction from text* -Natural Language Processing utilities -parallel processing You can find all the details here: If you like it, star it on GitHub! Best, Marco" "dbpedia/snorql and dbpedia query through jena" "uHello, What is going on with Your are here: This page doesn't exist yet. Maybe you want to create it? I am trying to execute the following query through Jena. String prefixes = \"PREFIX owl: \n\" +         \"PREFIX xsd: \n\"+         \"PREFIX rdfs: \n\"+         \"PREFIX rdf: \n\"+         \"PREFIX foaf: \n\"+         \"PREFIX dc: \n\"+         \"PREFIX : \n\"+         \"PREFIX dbpedia2: \n\"+         \"PREFIX dbpedia: \n\"+         \"PREFIX skos: \n\"+          \"PREFIX dbpedia-owl: \n\"+         \"PREFIX geo: \n\";                 String sparqlQueryString = prefixes +         \"SELECT ?resource ?page ?icon ?lat ?lon WHERE {\n\" +         \"?resource foaf:name \\"\"+name+\"\\"@en ;\n\" +          \" foaf:page ?page ;\n\" +         \"dbpedia-owl:thumbnail ?icon ;\n\" +         \"geo:lat ?lat ;\n\" +         \"geo:long ?lon.\n\" +         \"}\";                 Query query = QueryFactory.create(sparqlQueryString);         QueryExecution qexec = QueryExecutionFactory.sparqlService(\"         try {                        ResultSet results = qexec.execSelect(); } and I get no results but when i execute the same query in Could you help me please?? Hello, What is going on with response:" "equivalent class specification" "uIs it possible to specify that a class in the DBpedia ontology is equivalent to a class in an external ontology? I see the \"equivalentClass\" construct on the ontology class pages but I don't see an example of a connection outside of this ontology. Same question for properties. Almost the same question again for links to external URIs. 
When an article has a hyperlink within a template to an external URI (for example a node in the gene ontology), how can we represent this connection in the mapping ? thanks -Ben Is it possible to specify that a class in the DBpedia ontology is equivalent to a class in an external ontology?  I see the 'equivalentClass' construct on the ontology class pages but I don't see an example of a connection outside of this ontology. Same question for properties. Almost the same question again for links to external URIs.  When an article has a hyperlink within a template to an external URI (for example a node in the gene ontology), how can we represent this connection in the mapping ? thanks -Ben uHi Ben, On 13.01.2012 20:33, Benjamin Good wrote: find an example for owl:equivalentClass statements on the Person class: See I would suggest using foaf:page to connect those external pages, depends on their relation to the entity. Cheers, Anja uIn the Person example you sent, the mapping is achieved with: owl:equivalentClass = foaf:Person, schema:Person Where do you declare the base URI for the foaf and schema prefixes? Can we also extend the prefix declarations somehow or can we use complete URIs where you are using namespace shorthand? thanks very much -Ben On Mon, Jan 16, 2012 at 3:49 AM, Anja Jentzsch < > wrote:" "Paper: Autonomously Semantifying Wikipedia" "uHi all, the paper \"Autonomously Semantifying Wikipedia\" [1] by Fei Wu and Daniel S. Weldjust won the best paper prize at CIKM (ACM Sixteenth Conference on Information and Knowledge Management) in Lisbon, Portugal. I just had a short glance at the paper. Their work seems to apply Machine Learning techniques in order to improve the coverage and quality of Wikipedia. Regarding the relationship with Wikipedia they say: The DBpedia system which extracts information from existing infoboxes within articles and encapsulate them in a semantic form for query. In contrast, KYLIN populates infoboxes with new attribute values. Cheers, Sören [1] wu-cikm07.pdf uThanks for pointing this out. I really enjoyed reading that paper and it made me wonder if there are any plans to feed some of the DBpedia info back into Wikipedia. I would imagine that DBpedia is starting to accumulate some topics which are not available in Wikipedia yet . So it might be nice to contribute some of that knowledge back so that the Wikipedia folks can fill in those topics with more data. Shawn Sören Auer wrote:" "LDIF: A tool to process DBpedia dumps (blatant advertising)" "uHi all, For those of you who are processing subsets of DBpedia dumps, I wanted to point out LDIF (Linked Data Integration Framework), a new tool currently being developed by FU Berlin and MES. It allows to integrate local and remote Linked Data into a common schema and provides for identity resolution. We are currently looking for real-world use cases to exercise the framework. I'm available to help you get started and to discuss feature requests. LDIF lives here: Cheers, Christian" "Updated dbpedia Dataset released!" "uHi all, a new version of the dbpedia dataset is available for download from the dbpedia website: The dataset has grown from 25 to 31 million triples and includes now extended abstracts in 10 languages as well as additional links to several external data sources. 
The new dataset includes: - better short abstracts (stuff like unnecessary brackets has been removed from the abstracts) - new extended abstracts for each concept (up to 3000 characters long) - abstracts in 10 languages (German, French, Spanish, Italian, Portuguese, Polish, Swedish, Dutch, Japanese, Chinese). - 2.8 million new links to external Webpages - Cleaner infobox data - 10 000 additional RDF links into the Geonames database. - 9000 new RDF links between books in dbpedia and data about them provided by the RDF Book Mashup - 200 RDF links between computer scientists in dbpedia and their publications in the DBLP database - New classification information for geographic places using dbpedia terms and Geonames feature codes Lots of thanks to Georgi, Sören, Nikolai and Marc from Geonames for making this possible :-) Kingsley, Orri: Could you please load the new dataset into Virtuoso. Orri: The new dataset includes lots of free text (extended abstracts) so you have nice input for your new free text search feature. After you have indexed the stuff, I would love to put a note or a link on the dbpedia site explaining how to use the free text search. Could you please send me some text or an appropriate link. I was having a beer with Andreas Harth from DERI yesterday and he complained that the crawler of his Semantic Web Search Engine ( as it dereferences a couple of URIs per second. I told him that this shouldn't be a problem with dbpedia and we agreed that he would do a test crawl once the new data is loaded. Richard: Is there any progress with transferringthe dbpedia domain from our server to the Openlink server? I think having this done would be important for Andreas crawl and also help with DISCO's concurrence problems. Nice to see dbpedia developing so well :-) Cheers Chris uChris Bizer wrote: Yes. Tim/Mitko: Note the new dataset upload request. Kingsley" "Caching headers" "uDear all, I did a survey on the prevalence of HTTP cache headers on SPARQL Endpoint responses, and reported the results at ISWC last week. They weren't too good, even though quite a few expose them. As Webarch says \"Dereferencing a URI has a (potentially significant) cost in computing and bandwidth resources, may have security implications, and may impose significant latency on the dereferencing application. Dereferencing URIs should be avoided except when necessary.\" Since, the endpoint doesn't give any guideance to the application in terms of headers, it has to be derefenced, often needlessly. Since SPARQL is fairly heavy to evaluate, I'm sure you are very aware of the costs. Caching could most certainly be exploited for many SPARQL queries, especially for DBPedia, since the database changes infrequently. I previously posted an email to the developer's mailing list, but Ted informed me this would be a better forum. So, here's what should be done, in order of priority: 1) Set a Cache-Control header, e.g.: Cache-Control: max-age=2678400 This says the response is fresh for a month after the request. The value should be as long as possible, but not longer. :-) Meaning, if you know that you will not release another release in the next 8 months, you should add the number corrresponding to how many seconds there are in 8 months. Nearing the release, the value must be reduced. I suggest a month is a reasonable default figure, it will enable a lot of caching, but also allow you to plan updates within a month. Once the release date is known, it should count down towards that. 
At that point, it should also probably add must-revalidate also. 2) Set the Last-Modified header. The Last-Modified header is simply when the last DBPedia release was made. 3) Be able to respond with a 304 HTTP code if the client asks for revalidation with If-Modified-Since. Now, I don't know how dbpedia.org/sparql is configured, beyond that it runs Virtuoso and is maintained by Openlink. If you tell me a bit about how HTTP headers in general are added to the setup, I might be able to help with the specifics. If the above is done, it would also enable the use of a \"HTTP accelerator\", such as Varnish: as well. It would also enable institutional caching proxies, so that when everyone at Insight has to use DBPedia to launch the same queries meet their deadlines, the request doesn't need to go all the way to DBPedia. ;-) I've just started my research in this area. The rest of the Web uses this extensively, and it is quite likely that the above simple measures would benefit significantly, but my ambition is to provide more extensive benefits, both in terms of taking load off endpoints and to speed up client-percieved responses, based on prefetching. So, it'll help my research too :-) Cheers, Kjetil uOn 10/30/14 6:41 PM, Kjetil Kjernsmo wrote: Kjetil, This is a trivial matter. What isn't always trivial is receipt of constructive feedback (like this) from folks using the service. Ditto, our penchant for sometimes forgetting to atomize settings in our VAD packages. For instance, our VAD package should setup a DBpedia instance to do this by default, rather than having to remember these settings each time we reconstitute a new static instance, as we just done re. DBpedia 3.10 (a/k/a DBpedia 2014). Thanks for the reminder-feedback, so to speak. I am sure this will be in place by tomorrow, early next week, other workload permitting. Also note, we've been in a bit of a transitory state, in regards to Virtuoso releases, and this has impacted DBpedia endpoint stability in recent times. We are getting to the end of that cycle too. Kingsley uHi Kjetil, do you know if any major programming frameworks like Jena provide built-in support for exploiting those HTTP headers? This would probably speed up adoption by the average SW app developer out there. Best, Heiko Am 30.10.2014 um 23:41 schrieb Kjetil Kjernsmo: uOn Friday 31. October 2014 10.07.17 Heiko Paulheim wrote: Not the really major, as far as I know. However, the Perl library RDF::Trine supports sending in a custom User Agent, and several User Agents support caching. I didn't like any of them though, so I'm writing my own. Yeah. Kjetil uOn Friday 31. October 2014 00.08.13 Kingsley Idehen wrote: Awesome! I'll postpone my next survey till I see it in place :-) Cheers, Kjetil uOn 10/31/14 8:12 AM, Kjetil Kjernsmo wrote: Give us about a week or so. We just need to close out some Virtuoso release issues, which are leading to lots of confusion (right now) about the DBpedia SPARQL endpoint and Linked Data deployment, in regards to high availability etc. These issues will be packed into the Virtuoso VAD for DBpedia such that, post installation, any Virtuoso based DBpedia endpoint will have these in place. Much better that having to remember to set these up manually, following system upgrades and rebuilds etc Thanks again, for bringing this to our attention!" "DBPedia Ontologies" "uHi I need a list of all ontologies used in DBPedia exept DBPedia.owl. Also with the related version, For instance, which version of YAGO has been used. How can I find that? 
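For anyone who wants to verify points 1-3 from the client side, a small check along the following lines is enough; it only inspects what the endpoint currently sends and assumes nothing about the Virtuoso configuration itself (the query used is an arbitrary cheap one).

```python
import requests

ENDPOINT = "http://dbpedia.org/sparql"
QUERY = "SELECT ?s WHERE { ?s a <http://dbpedia.org/ontology/Place> } LIMIT 1"

params = {"query": QUERY, "format": "application/sparql-results+json"}

# First request: see which caching headers the endpoint advertises.
first = requests.get(ENDPOINT, params=params, timeout=30)
for header in ("Cache-Control", "Last-Modified", "ETag", "Expires"):
    print(f"{header}: {first.headers.get(header, '(not set)')}")

# Second request: revalidate. If the endpoint implements point 3 above,
# this should come back as 304 Not Modified with an empty body.
last_modified = first.headers.get("Last-Modified")
if last_modified:
    second = requests.get(
        ENDPOINT,
        params=params,
        headers={"If-Modified-Since": last_modified},
        timeout=30,
    )
    print("Conditional GET status:", second.status_code)  # 304 if supported
else:
    print("No Last-Modified header, so conditional requests cannot be tested.")
```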
Thanks Best Regards uHi, links to other ontologies are listed on the download page under 3. Extended Datasets: The links to the Yago ontology refer to YAGO2 [1]. Freebase links have also been updated in the latest release. Best regards, Max [1] On Mon, Feb 28, 2011 at 11:42, Saeedeh Shekarpour < > wrote: uI see, thank you. So I understand, DBPedia doesn't provide a complete taxonomy/ontology of its own, only mapped concepts to others? e.g. refers owl:sameAs to a Cyc concept, but does this mean that it has no rdfs:subClassOf in the DBPedia namespace? If not, I will have to load all the external ontologies, yes? I'm hoping to get the best of both worlds, where DBPedia provides a complete ontology also with mappings to others I may not use. thanks, Darren On Mon, 28 Feb 2011 12:01:27 +0100, Max Jakob < > wrote: u uMon, Feb 28, 2011 at 14:00, < > wrote: DBpedia has an ontology hierarchy of it own [1,2]. In this ontology, poker player is a subclass of person: dbpedia-owl:PokerPlayer rdfs:subClassOf dbpedia-owl:Person But the DBpedia hierarchy is not necessarily equal to the one of Cyc. In fact, in Cyc, poker player is a subclass of athlete (see So I guess it all depends on which ontology hierarchy you want to use. I hope I understood you problem right and was able help. Best, Max [1] [2] uOn Tue, Mar 1, 2011 at 11:32, Saeedeh Shekarpour < > wrote: Ok, I think I got you now. I extracted a list of external property namespaces from the extraction framework (org.dbpedia.extraction.ontology.OntologyNamespaces.scala): http://www.opengis.net/gml/ http://www.w3.org/2001/XMLSchema# http://purl.org/dc/elements/1.1/ http://purl.org/dc/terms/ http://www.w3.org/2004/02/skos/core# Note that the datasets under http://dbpedia.org/Downloads36#h142-1 all contain sameAs relations to the instances of the listed ontologies, but that does not mean that DBpedia shares classes or properties with these ontologies. YAGO classes are a special case here! The YAGO dataset also contains rdf:type relations to YAGO classes. Does this answer your initial question? Regards, Max" "Postcode Data" "uPresumably you want to create a linked data description of each of your videos which includes the postcode? Is the intention to put these descriptions up on Wikipedia, or will you be creating your own online resource? Richard, thanks for your advice. In response to your question then yes, Ideally what I would like to do is create a wikipedia page for postcode N19 4EH and include the video content for that postcode, along with other relevant data, but this seems a bit counter productive when considered alongside the OS resource (If I create a page called N9 4EH and include data from the OS) In laymans terms what I want to do is link the video data to existing linked data resources such as the OS, are you saying this could be done from within Wikipedia ? K uOn 11/11/2011 13:39, kevin carter wrote: I'm saying nothin' . Just trying to get a sense of what would be useful for you. One obvious point is that a Wikipedia page needs to be authoritative on the subject it is addressing. As such, if you put up a page for N19 4EH, it can't just be a vehicle for your own video, and would probably involve more work than you would want to put in. I think a better approach would be to think of embedding some Linked Data directly in your own pages: RDFa is a popular way of doing this (I'm told). This Linked Data would say \"this is a video; it depicts/islocated in postcode area X\" (where X would be the Ordnance Survey URL for the postcode in question). 
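Returning to the class-hierarchy answer above (dbpedia-owl:PokerPlayer rdfs:subClassOf dbpedia-owl:Person): the DBpedia ontology hierarchy can be walked directly on the public endpoint. The sketch below assumes SPARQL 1.1 property paths are available on the endpoint and uses the Python requests library; it is meant only to show the shape of the query.

```python
import requests

ENDPOINT = "http://dbpedia.org/sparql"

# Walk up the DBpedia ontology hierarchy from PokerPlayer, as discussed above.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?super WHERE {
  <http://dbpedia.org/ontology/PokerPlayer> rdfs:subClassOf+ ?super .
}
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for b in resp.json()["results"]["bindings"]:
    print(b["super"]["value"])  # expect Person, Agent, owl:Thing, ...
```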
And maybe some other useful metadata about the video. Then you can just wait for the semantic spiders to come and find your site. Richard uRichard, I appreciate your candor. Your approach sounds good. Currently the video is hosted on a proprietory video/web server and there is a requirement to move this. My plan was to migrate the video content to Wikipedia under a Public Realm License, create a set of linked Wikipedia pages which would then be included within the next DBpedia grab. So a better plan might be, move the video archive to a server within the public realm (suggestions ?) then build pages around the content that includes RDF/Metadata, and as you say wait for the bots. In this way side stepping the need to submit content via Wikipedia ? As for promoting my own video yep WP are rightly very clear on that. The point about the project is I did not make any of the videos, I just set up the software tools/server and invited people to record their own views on themselves, their community etc, so in that sense I hope I doesn't feel to much like self promotion, more an attempt to engage with the possibilities of community based technologies. Cheers for your time, K On Fri, Nov 11, 2011 at 2:01 PM, Richard Light < > wrote:" "Search uri by foaf:page" "uHello! I need to search DbPedia uri by wikipedia link (entered by user). For example: SELECT ?uri WHERE { ?uri foaf:page } But I need to make this query case-insensitive (for the case when, for example, user enters \"/moon\" instead of \"/Moon\"). Here is what I tried: SELECT ?uri WHERE { ?uri foaf:page ?url FILTER regex(str(?url), \"^ } But it is obviously very and very slow. After that I tried to use Virtuoso-specific full-text search: SELECT ?uri WHERE { ?uri foaf:page ?url . ?url bif:contains \" } but as foaf:page values are stored as Uri it doesn't work. I am pretty sure I am not the first who tries to solve this task. Are there any other ways? Regards, Alexander Hello! I need to search DbPedia uri by wikipedia link (entered by user). For example: SELECT ?uri WHERE { ?uri foaf:page < Alexander uOn 7/3/2011 1:40 AM, Alexander Sidorov wrote: Unfortunately case actually means something in Wikipedia, unlike Windows, which plays both sides of the road. Last time I checked there were about 10,000 entries in Wikipedia that differed only by case. In your case, you might (almost) get the results you want by searching on the rdfs:label field (which gives the title) but the real treasure trove of alternate names is under the dbpedia-owl:wikiPageRedirects field which doesn't (I think) get materialized as text. Note that these alternate names are often too embarrassing to display but OK to accept for searching. DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" On 7/3/2011 1:40 AM, Alexander Sidorov wrote: Hello! I need to search DbPedia uri by wikipedia link (entered by user). For example: SELECT ?uri WHERE { ?uri foaf:page < case when, for example, user enters \"/moon\" instead of \"/Moon\"). Here is what I tried: Unfortunately case actually means something in Wikipedia, unlike Windows, which plays both sides of the road. Last time I checked there were about 10,000 entries in Wikipedia that differed only by case. In your case, you might (almost) get the results you want by searching on the rdfs:label field (which gives the title) but the real treasure trove of alternate names is under the dbpedia-owl:wikiPageRedirects field which doesn't (I think) get materialized as text. 
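One way to combine the suggestions in this search thread (match on rdfs:label rather than foaf:page, and fold in redirects) is sketched below. It leans on the Virtuoso-specific bif:contains text index already mentioned, so it is not portable SPARQL, and "moon" is just an example search term.

```python
import requests

ENDPOINT = "http://dbpedia.org/sparql"

# Case-insensitive title lookup: use the label text index to narrow the
# candidates, compare lower-cased labels exactly, and follow redirects so
# that alternate spellings resolve to the canonical resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>
SELECT DISTINCT ?uri WHERE {
  ?page rdfs:label ?label .
  ?label bif:contains '"moon"' .
  FILTER (lang(?label) = "en" && lcase(str(?label)) = "moon")
  OPTIONAL { ?page dbo:wikiPageRedirects ?target }
  BIND (coalesce(?target, ?page) AS ?uri)
}
LIMIT 10
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "application/sparql-results+json"},
    timeout=60,
)
resp.raise_for_status()
for b in resp.json()["results"]["bindings"]:
    print(b["uri"]["value"])
```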
Note that these alternate names are often too embarrassing to display but OK to accept for searching. uOn Sun, Jul 3, 2011 at 07:40, Alexander Sidorov < > wrote: You can try to play around with the DBpedia Lookup Service and see if it helps your task: Cheers, Max uHi Max, Unfortunately I didn't find a way to search by wikipedia link using dbpedia.org/lookup. Regards, Alexander 2011/7/6 Max Jakob < > Hi Max, Unfortunately I didn't find a way to search by wikipedia link using dbpedia.org/lookup . Regards, Alexander 2011/7/6 Max Jakob < > On Sun, Jul 3, 2011 at 07:40, Alexander Sidorov < > wrote: > I need to search DbPedia uri by wikipedia link (entered by user). For > example: > SELECT ?uri WHERE > { >    ?uri foaf:page < Max" "Split the articles dataset" "uHi all, I think we should split the articles dataset: Articles (224 MB, 12 Mtriple) - descriptions of all 1.6 million concepts within the English version of Wikipeida with English titles and English abstracts, thumbnails, Wikipedia links. This is the dbpedia basis file which should be loaded into each dbpedia repository. Some people just might want the titles, the abstracts, the thumbnails or the links. The titles might stay the primary dataset. I think to require people to load GB of data creates a severe hurdle. u uSören, On 14 Mar 2007, at 15:32, Sören Auer wrote: Good point. What do you think, is the bigger hurdle the pure size of the dump in bytes, or is it the number of triples? If it's the size, then separating the English abstracts out into a separate dump should help a lot, they are the 20% of the triples that cause 80% of the byte size (very rough guess). Richard uRichard Cyganiak wrote: For different people/systems, different characteristics of the datasets might be the hurdle. Thats why I proposed to split it into four: 1. titles, 2. abstracts, 3. thumbnails, 4. links Concatenating the files is not a problem, so the more smaller datasets we have the better, if you want to do the same for different languages I suggest a matrix (table) with the download links." "Setup Local DBpedia Version" "uHello everybody, Are there any tutorial on how to setup a DBpedia instance on your local machine ? Thank you. Cheers, Ahmed. uHave a look here: Best regards, -Mariano On Sun, Apr 21, 2013 at 11:39 PM, Ahmed Ktob < > wrote: uHello Mariano, Thank you so much. Best regards, Ahmed. On 21 April 2013 23:17, Mariano Rico < > wrote:" "Questionable DBO-DOLCE equivalence mappings" "uHi all, I've recognized that in the latest DBpedia ontology (and maybe also in previous versions), there are a few questionable equivalence statements, as listed below. Note that, among other problems, those axioms imply that each MusicGenre is a GovernmentType and vice versa, that each Year is a Holiday and vice versa, etc. Moreover, some classes, such as MusicGenre, become unsatisfiable when the DUL/D0 ontologies are used in a reasoner. How did those mappings get in? Shouldn't we replace them with subClassOf axioms? 
Best, Heiko dbo:Ideology owl:equivalentClass d0:CognitiveEntity dbo:Monastery owl:equivalentClass d0:Location dbo:Tax owl:equivalentClass dul:Description dbo:GovernmentType owl:equivalentClass dul:Concept dbo:Sales owl:equivalentClass dul:Situation dbo:Database owl:equivalentClass dul:InformationObject dbo:Holiday owl:equivalentClass dul:TimeInterval dbo:MeanOfTransportation owl:equivalentClass dul:DesignedArtifact dbo:UnitOfWork owl:equivalentClass dul:Situation dbo:PenaltyShootOut owl:equivalentClass dul:Event dbo:MusicGenre owl:equivalentClass dul:Concept dbo:Year owl:equivalentClass dul:TimeInterval dbo:Food owl:equivalentClass dul:FunctionalSubstance dbo:Project owl:equivalentClass dul:PlanExecution dbo:List owl:equivalentClass dul:Collection dbo:Organisation owl:equivalentClass dul:SocialPerson dbo:Unknown owl:equivalentClass dul:Entity dbo:Polyhedron owl:equivalentClass dul:SpaceRegion dbo:LegalCase owl:equivalentClass dul:Situation uHi, thanks Heiko for spotting this. BTW in the original mappings for version 3.9 those axioms were all subClassOf. Probably something has gone weird in the versioning. My proposal, which I made two years ago, is that, since DOLCE mappings have the main goal to help integrity checking or top-level querying, they should be revised shortly before each new DBpedia ontology release, so that no incoherences are accidentally introduced in the release. In fact, even with 3.9, some changes were made to the ontology after I sent out the mappings, and some (just a few) axioms, which were defined with unions, were removed, probably because unions were not intended to be enforced in DBpedia. In absence of a clear management of top level mappings, I still hesitate to update them, although there is a clear need to do it. Cheers Aldo u(cc'ing the DBpedia ontology list) Thank you Heiko for identifying the inconsistencies a big thank to Aldo for taking the effort of creating the original mappings. personally, I am not aware of any changes to the mappings (subClassOF / equivalentClass/Property) but I was not so involved in the releases before 2015-04 Markus and I have the overview of the current releases but we are not dolce experts and not sure how to proceed here. They way I see it we need 2 things here: 1) identifying the errors & 2) fixing them looks like Heiko has something wrt (1) that we can reuse for the following release. We need help with (2) though. If the mapping flexibility is an issue I would suggest to keep these mappings out of the mappings wiki. We plan to move the ontology out of the wiki as well when we switch to RML so it is something we will do anyway Any other suggestions / feedback? Cheers, Dimitris On Wed, Oct 26, 2016 at 6:35 AM, Aldo Gangemi < > wrote:" "dbpedia uri encoding policy: conflating URIrefs and URIs?" "uHi folks, (apologies in advance for a rather longwinded post) I have a comment regarding the URI encoding policy as defined at Specifically, it says this: 'All other characters are unsafe and are first converted into one or more bytes using UTF-8 encoding. Then each byte is represented by the 3-character string \"%xy\", where xy is the two-digit hexadecimal representation of the byte.' This policy results in DBPedia RDF containing URIrefs like: This is a valid URI, but it seems to me that DBpedia's definition of 'unsafe characters' is slightly too broad, and more importantly, that the current policy is jumbling the notions of URIs and URI /references/. RDF uses URI references[1]. 
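Back on the DOLCE mappings: for step (1), identifying the suspect axioms, a local check over the ontology dump is enough. The sketch below uses rdflib; the file name and the exact DUL/D0 namespace IRIs are assumptions and may need adjusting to the copy of the ontology being reviewed.

```python
from rdflib import Graph
from rdflib.namespace import OWL

# Local copy of the DBpedia ontology; the filename is an assumption --
# point it at whichever dbpedia_*.owl dump you are checking.
ONTOLOGY_FILE = "dbpedia.owl"

# The usual DOLCE-related namespaces (adjust if your copy uses different IRIs).
DOLCE_PREFIXES = (
    "http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#",
    "http://www.ontologydesignpatterns.org/ont/d0.owl#",
)

g = Graph()
g.parse(ONTOLOGY_FILE, format="xml")

# List every equivalentClass axiom that points into DUL or D0, so they can be
# reviewed and, where needed, demoted to subClassOf before the next release.
for dbo_class, target in g.subject_objects(OWL.equivalentClass):
    if str(target).startswith(DOLCE_PREFIXES):
        print(f"{dbo_class} owl:equivalentClass {target}")
```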
These should not contain %-encoded chars themselves, but are defined to be valid Unicode strings that would result in a valid URI *after encoding is applied*. How 'unsafe' characters in URIrefs should be escaped or encoded is a feature of a particular RDF serialization format. For example, in N-Triples all URIref chars that are not in the US-ASCII range are encoded using \-escape sequences (see [2]). However, since '(' and ')' are in fact US-ASCII chars, there is no need to encode/escape these chars in a URIref in N-Triples (nor, AFAIK, in any other of the more common RDF serialization formats). The reason I am raising this point is that there may be a real problem when clients read the RDF produced by DBPedia: RDF processors, if spec-conformant, assume that what they get back is a URI reference, and that they may need to %-encode it to produce a valid URI. However, since DBPedia produces URIrefs that already contain %-encodings, such a processor would apply %-encoding twice (since % itself is a character that needs to be encoded): In other words: an RDF processor will, under the current policy, not be able to distinguish between a URI containing a %-encoded character and a URIref containing the %-character itself. To make a long story short: I believe the DBPedia RDF should contain URIrefs like this: and it should leave the %-encoding up to the client. Regards, Jeen [1] [2] #sec-uri-encoding uHi Jeen, you raise a point that many DBpedians have on their minds. Double-encoded percent signs and decreased readability are popular arguments for %-encoding less characters. Especially because some of them are in fact not \"unsafe\" (I deleted this statement on [1]). However, as stated on the referenced page, the encoding of URIs is modeled after the encoding of Wikipedia identifiers. And I am not sure if diverging from this scheme is a good idea. Cheers, Max [1] On Tue, Aug 9, 2011 at 01:03, Jeen Broekstra < > wrote: uOn 12/08/11 23:43, Max Jakob wrote: Of course, if changing the policy would break compatibility with Wikipedia, that would indeed be bad. But consider this: the Wikipedia identifier encoding scheme is about encoding URIs, not URI *references*. The change I propose would still be compatible with that, all I'm saying is that the things that appear in the DBPedia RDF should be URI references, not (%-encoded) URIs. If anything, it would be _more_ compatible with Wikipedia that way, not less. Regards, Jeen" "DBpedia content not valid RDF/XML" "uHi, all of these errors stem from the problem that not all RDF triples can be represented in RDF/XML. [1] (IMHO, a shortcoming in the RDF/XML spec that could easily have been fixed by introducing something like , similar to .) As Jeen Broekstra wrote on dbpedia-discussion in August 2011 [2]: \"The only reliable way around the problem is to use a serialization format that does cope with all legal RDF properly, such as N-Triples or Turtle.\" But still, when someone really wants RDF/XML, what should Virtuoso do with triples that can't be serialized? In some cases, there actually is a possible representation. For example, the property URI could be represented as Weird and confusing for humans, no problem for computers. In those cases that can't be represented in RDF/XML, the spec says 'throw a \"this graph cannot be serialized in RDF/XML\" exception or error' [1]. Probably not a good solution for us. 
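The double-encoding problem described above is easy to reproduce. The snippet below uses Titan_(moon) purely as an example of a resource name containing characters outside the unreserved set; the point is what a spec-conformant client does when it treats its input as a URI reference.

```python
from urllib.parse import quote

# A URIref containing parentheses, and the same reference already %-encoded
# the way the DBpedia dumps serialize it.
uriref  = "http://dbpedia.org/resource/Titan_(moon)"
encoded = "http://dbpedia.org/resource/Titan_%28moon%29"

# A processor that treats its input as a URI *reference* will percent-encode
# unsafe characters itself before dereferencing:
print(quote(uriref,  safe=":/"))   # ...Titan_%28moon%29      (what we want)
print(quote(encoded, safe=":/"))   # ...Titan_%2528moon%2529  (double-encoded!)
```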
I think Virtuoso should omit such triples from RDF/XML, but include something like a comment in their place that they were omitted and are available in other formats (like NTriples). Regards, Christopher [1] [2] On Sun, Feb 26, 2012 at 21:02, Axel Polleres < > wrote: uHi, I just found an old thread [1] that mentions another solution - append an underscore to property URIs that otherwise cannot be serialized as RDF/XML. I think DBpedia should do that for the few URIs that cause problems, but not too many, e.g. shouldn't be changed. See \"I18n of Semantic Web Applications\" [2] by Auer et al for a thorough investigation. Christopher [1] [2] On Feb 28, 2012 6:20 PM, \"Jona Christopher Sahnwaldt\" < > wrote:" "Mapping Statistics (Questions arised from the mapping marathon)" "uHi Mariano, I don't have answers for everything, but here goes my 2c. (split by subject) STATISTICS I don't know the answer. Paul Kreis is possibly the only one that would know. is there any way for knowing which username created more mappings? Yes. We do that for the DBpedia Portuguese. See pt.dbpedia.org. I'm glad to share the code. What is the meaning of the grey rows in the statistics page? It says The answer is here: \"the statistics contain non relevant templates like Unreferenced or Rail line. These templates aren't classical infoboxes and shouldn't affect the statistics. On that account they can be ignored. If a template is on the ignore list, it does not count for the number of potential infoboxes.\" Folks, anybody else can chip in? Cheers, Pablo" "total number of triples and instances in current DBpedia version 3.9" "uHi, I want to know the correct number of instances and total triples for theEnglish version of DBpedia 3.9. I have come across the DBpedia statistics page ( The reason being, I read in a paper that they mentioned in DBpedia version 3.4, it had 3.5 entities (instances) with 672 million triples. Having that in mind, DBpedia statistics page says that version 3.9 has 4 million (4,004,478) instances and 70 million (70,147,399) raw statements. Recent DBpedia paper ( Couls somebody clarify the correct numbers for me for both English version and whole DBpedia. Total number of things (instances) and total number of triples. Thank you very much. uHi, As it's me who have done the statistics you mention, let me try to clarify. [1] and [2] are based on DBpedia dumps for 3.9 and 3.8, respectively. The last DBpedia paper has the numbers for 3.8 - and in the statistics page for 3.8 [2] you indeed find 3.7 mln entities. Why is \"400 mln triples\" not there? Because [2] counts *just* raw property statements extracted from infoboxes (65 mln), type statements (13.7 mln) and mapped (to DBpedia ontology) property statements (33.7 mln). It does not count however many other triples: those coming from inter-language links, abstracts, categories, links to other resources and so on, check the download pages for the whole list [3,4]. If you count all these, perhaps, you'll arrive at 400 mln triples. In fact, SELECT COUNT(*) WHERE {?x ?y ?z} executed against DBpedia SPARQL endpoint returns 825,761,509 at the moment. And actually I am not sure that all datasets available at [5] are loaded into the endpoint, so the total number for English can be even bigger. Summarizing, [1,2] are good sources for getting numbers of things/instances. For the number of triples - depends on what you want to count. 
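To illustrate why some predicates cannot be written out at all, recall that an RDF/XML serializer has to turn the end of every predicate IRI into an XML element name. The check below is a simplified sketch (ASCII-only NCName rule, made-up example predicates), not the behaviour of any particular serializer.

```python
import re

# Very rough version of the constraint an RDF/XML serializer faces: the tail
# of a predicate IRI must form a legal XML element name (simplified NCName).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*$")

def xml_qname_tail(iri):
    """Longest suffix of `iri` usable as an XML element local name, or None."""
    match = NCNAME.search(iri)
    return match.group(0) if match else None

for predicate in (
    "http://dbpedia.org/property/population",    # fine: tail 'population'
    "http://dbpedia.org/property/%22title%22",   # no legal tail -> problem
    "http://dbpedia.org/ontology/",              # ends in '/'   -> problem
):
    tail = xml_qname_tail(predicate)
    print(predicate, "->", tail if tail else "cannot be written as an RDF/XML element")
```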
For types and properties refer to [1,2], for total number of triples - refer to SPARQL endpoints for English and some other languages for which the endpoints exist. Or go through the dumps and count :) Cheers, Volha [1] [2] [3] [4] [5] On 4/19/2014 11:59 PM, Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva wrote: uA few minor additions: On 20 April 2014 18:58, Volha Bryl < > wrote: No, only certain datasets are loaded. They are listed here: The number of lines in each dataset file is listed in this file: There are a few comment lines in each file, so the number of triples is slightly lower, but not by much. I just counted the lines in all English NT files by the following command. (grep -v is necessary to remove a few files that contain almost the same triples as other files.) grep 'en/.*\.nt' lines-bytes-packed.txt | grep -vE 'unredirected|same_as|see_also|chapters|cleaned' | awk '{sum+=$3} END {print sum}' Result for en: 488 million triples. For all languages: 3.1 billion triples Regards, JC uHi Christopher, A curiosity: On 4/21/2014 3:05 AM, Jona Christopher Sahnwaldt wrote: Why then the triple count according to the endpoint (see the query above) is more than 800 mln? From your explanations (not all triples are loaded) it should be the other way around. Cheers, Volha uThank you very much for the clarification. I think if I want to get total number of approximate triples for DBpedia English version, I should get it from the endpoint with the query suggested. From: Volha Bryl < > Sent: Sunday, April 20, 2014 12:58 PM To: Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva; Subject: Re: [Dbpedia-discussion] total number of triples and instances in current DBpedia version 3.9 Hi, As it's me who have done the statistics you mention, let me try to clarify. [1] and [2] are based on DBpedia dumps for 3.9 and 3.8, respectively. The last DBpedia paper has the numbers for 3.8 - and in the statistics page for 3.8 [2] you indeed find 3.7 mln entities. Why is \"400 mln triples\" not there? Because [2] counts *just* raw property statements extracted from infoboxes (65 mln), type statements (13.7 mln) and mapped (to DBpedia ontology) property statements (33.7 mln). It does not count however many other triples: those coming from inter-language links, abstracts, categories, links to other resources and so on, check the download pages for the whole list [3,4]. If you count all these, perhaps, you'll arrive at 400 mln triples. In fact, SELECT COUNT(*) WHERE {?x ?y ?z} executed against DBpedia SPARQL endpoint returns 825,761,509 at the moment. And actually I am not sure that all datasets available at [5] are loaded into the endpoint, so the total number for English can be even bigger. Summarizing, [1,2] are good sources for getting numbers of things/instances. For the number of triples - depends on what you want to count. For types and properties refer to [1,2], for total number of triples - refer to SPARQL endpoints for English and some other languages for which the endpoints exist. Or go through the dumps and count :) Cheers, Volha [1] [2] [3] [4] [5] On 4/19/2014 11:59 PM, Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva wrote: Hi, I want to know the correct number of instances and total triples for theEnglish version of DBpedia 3.9. I have come across the DBpedia statistics page ( The reason being, I read in a paper that they mentioned in DBpedia version 3.4, it had 3.5 entities (instances) with 672 million triples. 
Having that in mind, DBpedia statistics page says that version 3.9 has 4 million (4,004,478) instances and 70 million (70,147,399) raw statements. Recent DBpedia paper ( Couls somebody clarify the correct numbers for me for both English version and whole DBpedia. Total number of things (instances) and total number of triples. Thank you very much. uOn 21 April 2014 12:02, Volha Bryl < > wrote: Good question. I dont' know. The number of lines in all files listed in DatasetsLoaded39 [1] (same files as in datasets.txt [2] and linksets.txt [3]) is 341,542,042 - not even half the number given by COUNT(*). @OpenLink: can you help? Maybe you guys added some other datasets or inferred a lot of triples when you loaded the DBpedia datasets? Just curious. Details: cat datasets.txt linksets.txt > loaded.txt grep -f loaded.txt lines-bytes-packed.txt | awk '{sum+=$3} END {print sum}' Cheers, JC [1] [2] [3]" "Easiest way to translate mappings?" "uHello again, I've registered at mappings.dbpedia.org with the account name dbchris and would request editor rights, if not already given. Sinve I'm basically interested in translating the mappings I would like to know whats the best way to do that. So far I could only search for specific mappings I want to edit, but ideally I would like to search for all mappings that have'nt been translated yet. If thats not possible I'll just edit the dbpedia-3.7.owl. Can I upload the file afterwards anywhere? uHi Chris, a big welcome to DBpedia! I just gave you editor rights. I guess that by translating you mean the labels for classes, properties and datatypes. There currently is no list of missing labels. It would be relatively easy to add a page to the DBpedia server (the one that extracts samples, displays the statistics etc.) that displays missing labels for a certain language. Maybe just an hour or two of workjust go through all ontology items, check if they have a label for the desired language, and display a link to their wiki pages. I won't have time to do that in the next two weeks or so (maybe later). But if you know a little Scala, you can easily do it yourself, and I'll gladly help you with any questions. The ontology and the labels have to be edited directly in the wiki. There is no way to edit the OWL file and upload the result to the mappings wiki, and it would be very hard to add that feature. You can easily download all ontology items from the wiki to one file and edit that file, but you can't upload the result, so that won't help either. It would probably be possible to somehow allow uploading that file, but while not as hard as handling the OWL file it's still a good deal of work and has some problems, so I don't think we're going to implement that. Cheers, JC On Sat, May 19, 2012 at 3:04 PM, Christoph Lauer < > wrote: uHi Chris, I added pages listing all missing labels. For example: I saw that you already added over a hundred German labels to the mappings wiki. Wow! Good job. Thanks for your work! It's great to see how many people are contributing to DBpedia. :-) By the way, new labels and comments should preferably be added in the format labels = {{label|de|}} We're trying to move away from the old format rdfs: = See for a few more details. Again, thanks a lot for your contributions! Cheers, JC On Sat, May 19, 2012 at 3:55 PM, Jona Christopher Sahnwaldt < > wrote:" "'Multilingual' dbpedia URIs 200 OK and URIs" "uHello Ed, Richard Interesting. 
Suppose you define at some point the URI What would be the relationship between this URI and One could assert apparently safely the following, even if Pat is monitoring the list ;-) owl:sameAs But in this case, what would be the use of this new URI? Others can imagine that French-speaking and English-speaking wikipedians are not exactly tuned on what a Library or Bibliothèque is. For example the categories in the two wikipedias might differ or even be mutually inconsistent (not to mention separately, which is another story). Of course the 'multilingual' URIs make sense for things defined in a non-english WP, but not in the english WP. Are there any plans in dbpedia team to integrate such 'multilingual' URIs (so to speak) in a foreseeable future? Bernard uOn 11 Aug 2009, at 23:02, Bernard Vatant wrote: The idea has been floating around for quite some time, but AFAIK there are no resources allocated to it, and there are a large number of more pressing issues. I mentioned it just to reassure the OP about the stability of the English-language URIs. Best, Richard" "Geo coordinates in German dbpedia" "uHi! While browsing the german dbpedia I found that all the coordinates are rounded to integers, which renders them basically useless. Someone already mentioned this in the german dbpedia mailing list some month ago but since that list is not very active i thought i just ask here. Is there anyway I could fix that or are people aware of this problem and fixing it already? Thanks a lot,umut Hi! While browsing the german dbpedia I found that all the coordinates are rounded to integers, which renders them basically useless. Someone already mentioned this in the german dbpedia mailing list some month ago but since that list is not very active i thought i just ask here. Is there anyway I could fix that or are people aware of this problem and fixing it already? Thanks a lot, umut" "Out of memory sparql endpoint" "uHi, Hope someone can help me please: when I use the following query I get an SQ200. Looks like the mem pool is 20mb in size. Should I expect any better, or is there some way I can adjust my query to be more efficient ? 42000 Error SQ200: The memory pool size 20018584 reached the limit 20000000 bytes, try to increase the MaxMemPoolSize ini setting SPARQL query: PREFIX lew: PREFIX : PREFIX res: PREFIX geo: PREFIX dbpo: PREFIX dbprop: PREFIX w3geo: PREFIX xsd: PREFIX dbpedia: PREFIX rss: PREFIX rdf: PREFIX skos: SELECT DISTINCT ?s ?lat ?long WHERE { ?s rdf:type dbpo:PopulatedPlace . { { ?s dbprop:latd ?lat .} UNION { ?s w3geo:lat ?lat .} UNION { ?s dbprop:latitude ?lat .} } { { ?s dbprop:longd ?long .} UNION { ?s w3geo:long ?long .} UNION { ?s dbprop:longitude ?long .} } FILTER ( ( ( ( ?lat >= 50.0 ) && ( ?lat <= 55.0 ) ) && ( ?long >= -9.0 ) ) && ( ?long <= 0.0 ) ) { { ?s dbprop:name \"Bangor\"@en .} UNION { ?s dbprop:officialName \"Bangor\"@en .} } OPTIONAL { ?s dbprop:province :Connacht ; dbprop:county :County_Mayo ; dbprop:subdivisionName :County_Mayo ; dbprop:country \"Ireland\"@en ; dbprop:country \"Ireland \"@en ; dbprop:country \"Republic of Ireland\"@en . } } LIMIT 1 TIA Hi, Hope someone can help me please: when I use TIA" "nlp data" "uDear list members, it seems that the nlp data in the dbpedia downloads section is unavailable. I think there is a open issue in the github repo on that. Is there a link that I can download it from and /or any documentation on how this data can be used ? 
All the best, Thiago" "Existing entities vs General entities" "uHello, for research purposes I'm interested in differentiating DBpedia entities into two types, those which are actual existing elements (i.e., things you can see and touch) and generalizations of elements (i.e., things which are abstractions of existing elements). Examples of the first ones could be: John Turturro, the Golden Gate Bridge, the Enola Gay bomber. Examples of generalizations could be: Football, the femur, Boeing B-29 Superfortress bomber. After reading some documentation on the DBpedia, including the latest article published, it looks to me that such difference is never made. Furthermore I wonder if it is even possible to make that difference based on the information available. Unfortunately, I do not know enough about the Wikipedia semantics to answer that. The only solution I can think of is manually tagging entities. That could be facilitated by grouping elements (e.g., every entity of class Person is an existing entity). However, other classes would require individual treatment. So my questions are these: -Is there a difference in DBpedia between existing entities and general entities? -Is there information available in the Wikipedia to make such difference? -Based on the DBpedia, is there any other method beyond manual tagging to make that difference? -Of the DBpedia Ontology, which classes could be considered as holding existing entities? Person, Place, Planet, Work, ? I know is quite an abstract question, and not fully related with technical aspects of the DBpedia, but I think this is the place to ask. Thank you all for your time, Dario. uOn 11/2/2011 1:38 PM, Dario wrote: This is a great research topic, but it's also of considerable commercial importance. If one were interested in converting DBpedia facts to text or creating a user interface, it would be good to know about \"abstract\" vs \"concrete\" There is the possibility of defining classes (:SomethingThatCanHaveAMember) or (:SomethingThatIsOnlyAnInstance) but also the possibility of defining an abstract/concrete score which is numerical. People tend to be very concrete, but we have no idea who :D.B._Cooper was or what his fate was. :Captain_Kirk is more abstract than :William_Shatner. When dealing with the more difficult stuff, a numeric score might be the best you can do. I think the strategy of starting with types and then refining the results is best. You could probably get a large majority of topics properly typed, particularly if you use type information from Freebase, which is more accurate and comprehensive than DBpedia types. The hard ones are going to be the things that fall through the cracks in the type system, like but note that Freebase has 18 types for this topic, so you're not without hope. Maybe it's a fair guess to say that \"things that fall through the cracks\" are abstract. I say: try the obvious thing with types, then do some evaluation. If you're not happy with it, maybe you'll think of another heuristic (traditional knowledge engineering) or maybe you can train a machine learning algorithm to make the distinction. Evaluate again and repeat until you've got enough for a paper or a product that's \"good enough to use\". I'd love to see a Turtle file published with these classifications because I could use them. uHi Paul, I agree. From my point of view, concrete entities have a different nature than abstract entities, which justifies the fact of using them differently. 
Those differences have been discussed for thousands of years (see Aristotle for example), and we should not ignore all that previous work. I may be naive in that matter, but I think that this subject is one of the keys for the success of the Semantic Web and for the field of AI in general. To be honest, I had not thought about a numeric value of abstractness before. Although it may be useful in practice (I'll have to think about that), I do not think it is 'natural'. As I see it, it would be more like a measure of ignorance. For example, your neighbor. Imagine you have never seen him, but you know he's a man because you have heard his voice. He is however a concrete entity for sure. The issue here is that your ignorance towards that entity causes you to generalize with him and treat him, in some aspects, like an abstract entity. You would assume he has two eyes, for example. That's a topic that also interests me and which falls within my PhD, since my main interest is to make knowledge discovery based on some Semantic Web. Thanks for the advice. It seems that Freebase is a little more specific in that difference than DBpedia. I'll study their data sets in detail. There are several approaches to solve this problem. Data Mining, Natural Language Processing, Machine Learning, OntologiesNon of which has worked well enough so far. My proposed solution, which I have not seen anyone else trying, uses inference methods to discover and learn about entities. I'd say its a mix of DM, ML and Ont. But I need some data to start with. Freebase or DBpedia may just be it. Hopefully in one or two years you will have your Turtle file. Dario." "ImageExtractor issue" "uHi, We have been trying to setup an instance of dbpedia to continously extract data from wikipedia dumps/updates. While going through the output we observed that the image extractor was only picking up the first image for any page. I can see commented out code present in the ImageExtractor which seems to pick all images. In place of that we have the code which returns on the first image it encounters. My questions are : 1. Does the commented out code actually works ? Does it really pick all the images on a particular page? 2. Why was the change made in the code ? Thanks and Regards Amit ImageExtractor issue Hi, We have been trying to setup an instance of dbpedia to continously extract data from wikipedia dumps/updates. While going through the output we observed that the image extractor was only picking up the first image for any page. I can see commented out code present in the ImageExtractor which seems to pick all images. In place of that we have the code which returns on the first image it encounters. My questions are : Does the commented out code actually works ? Does it really pick all the images on a particular page? Why was the change made in the code ? Thanks and Regards Amit uHi Amit, extract data from wikipedia dumps/updates. While\" We would like to do the same for the DBpedia Portuguese. If you can share any code, it would be much appreciated. Cheers Pablo On Mar 19, 2012 10:38 AM, \"Amit Kumar\" < > wrote: uHi Pablo, For the continuous extraction we are trying to setup a pipeline, which polls and downloads the Wikipedia data, passes it through DEF(Dbpedia Extraction Framework) and then create knowledgebases. Many of the plumbing is handled by Yahoo! Internal tools and platform but there are some pieces which might be useful for the Dbpedia community. I’m mentioning some below. Let me know if you think you can use anyone. 
if yes, I would contact our Open Source Working Group Manager to take it forward. 1. Wiki Downloader : We have two components. * Full Downloader: A basic bash script which poll the latest folder of wikipedia dumps. Check if a new dumps is available and downloads it to a dated folder. * Incremental Downloader: It includes an IRC bot which keeps listening to wikipedia IRC channel. It makes a list of files which were updated. It De-dups and downloads those pages every few hours while respecting the wikipedia QPS. 2. Def Wrapper: A bash script which invokes the DEF on the data generated by the downloader. Both these have some basic notifications and error handling. There are some stuff after DEF, but they are quite internal to Yahoo!. I think you already have a download.scala which downloads the dbpedia dumps. There were few mails in the last week about the same. If you are facing some particular issue in particular with DBpedia Portuguese, do let me know. If we have faced the same, we would let you know. Regards Amit On 3/19/12 3:45 PM, \"Pablo Mendes\" < > wrote: Hi Amit, We would like to do the same for the DBpedia Portuguese. If you can share any code, it would be much appreciated. Cheers Pablo On Mar 19, 2012 10:38 AM, \"Amit Kumar\" < > wrote: Hi, We have been trying to setup an instance of dbpedia to continously extract data from wikipedia dumps/updates. While going through the output we observed that the image extractor was only picking up the first image for any page. I can see commented out code present in the ImageExtractor which seems to pick all images. In place of that we have the code which returns on the first image it encounters. My questions are : 1. Does the commented out code actually works ? Does it really pick all the images on a particular page? 2. Why was the change made in the code ? Thanks and Regards Amit uHi Pablo, Amit, Although I didn't write the image extractor, I think that this is more a matter of semantics than technical and it was left this way intentionally. The first picture is usually the most representative of the article and thus we use foaf:depiction. Other pictures might not be about the article itself, but for other closely related articles i.e. [1,2,3] so it might not be the best approach to extract them all. We do have the download.scala for wikipedia dumps, but the DIEF Wrapper would be great! One script to invoke them all ;) Anyway, I guess you have a reason to do it this way, but the DBpedia live approach could make more sense for a continuous integration. Cheers, Dimitris [1] [2] [3] Hi Pablo, Amit, Although I didn't write the image extractor, I think that this is more a matter of semantics than technical and it was left this way intentionally. The first picture is usually the most representative of the article and thus we use foaf:depiction. Other pictures might not be about the article itself, but for other closely related articles i.e. [1,2,3] so it might not be the best approach to extract them all. Wiki Downloader : We have two components. Full Downloader: A basic bash script which poll the latest folder of wikipedia dumps. Check if a new dumps is available and downloads it to a dated folder. Incremental Downloader: It includes  an IRC bot which keeps listening to wikipedia IRC channel. It makes a list of files which were updated. It De-dups  and downloads those pages every few hours while respecting the wikipedia QPS. Def Wrapper: A bash script which invokes the DEF on the data generated by the downloader. 
We do have the download.scala for wikipedia dumps, but the DIEF Wrapper would be great! One script to invoke them all ;) Anyway, I guess you have a reason to do it this way, but the DBpedia live approach could make more sense for a continuous integration. Cheers, Dimitris [1] Coptic_Orthodox_Church_of_Alexandria uAmit, Both sound great! We'd love to have them contributed to the project. Cheers, Pablo On Mon, Mar 19, 2012 at 11:45 AM, Amit Kumar < > wrote: uHi all, we also wrote bash scripts that download the latest wikipedia dumps [1][2] and import them into a database [3]. I wasn't around when we switched from bash to Scala, but I guess it was because we wanted code that can also run on Windows. Regards, JC [1] [2] [3] On Mon, Mar 19, 2012 at 11:45, Amit Kumar < > wrote:" "Wikidata granularity and Linked Data" "uOn 3/10/14 3:08 PM, Daniel Kinzler wrote: Daniel, In the light of your comments above, why can't Wikidata, Freebase, and DBpedia cross-reference is a natural way. All I see right now are the following, which I find suboptimal, for all the wrong reasons: 1. Wikidata cross references Freebase 2. Freebase cross references Wikidata 3. DBpedia cross references both, but makes bridge URIs in the DBpedia namespace to achieve this, in regards to Wikidata. 1-3 isn't how this should be working out on the Web of Linked Data. Links: [1] 1lZyKy3" "Mappings submission failure" "uHi everyone, I'm experiencing problems when trying to save mappings in the wiki. Getting the following error: Sorry. If you can read this, saving the mapping was not successful. Please hit the 'Back' button in your browser and try again! If this happens repeatedly, please contact the DBepdia developers. (curl_exec hung itself in extensions/Ultrapedia-API/Ultrapedia-API.class.php->onArticleSave; probably DBpedia server took to long to answer the PUT) AAA Maybe too much mapping sprint? :-) Cheers, Marco" "DBPedia Live - changesets and dumps" "uhi All, Since early this month I noticed that the change sets (live updates) for DBPedia live have not been updated in this URL - Also no new dumps have been generated since the last one which was at the end of July. Do you know whats happening on this front and any idea about if/when these feeds will be back to normal? Thanks, Deepak hi All, Since early this month I noticed that the change sets (live updates) for DBPedia live have not been updated in this URL - Deepak uHi Deepak, On 09/14/2012 01:50 PM, Prav Prav wrote: Actually there are changesets up to September 3rd. We create a dump every month, but it seems that I forgot to generate the one of August. Sorry for that, and I'll generate a new one ASAP. We were moving to a new building and we had to move our servers as well. We had also a problem with our local wiki, but fortunately it is fixed now. Sorry for any inconvenience." "schema.org ontology property not found error when editing mappings" "uHi guys, I've just added a mapping [1] in the wiki and I got validation errors like this: Couldn't load property mapping on page title=Box successione;ns=212/Mapping it/Mapping it;language:wiki=en,locale=en. Details: Ontology property not found: schema:duration I'm actually using a property coming from schema.org ontology. Validation works well when using foaf ontology for example. Any hints? Thank you in advance. Cheers, Marco [1] Mapping_it:Box_successione uOn 2 May 2012 18:10, Marco Fossati < > wrote: Reading around your link, \"Mapping Validator. When you are editing a mapping, there is a validate button on the bottom of the page. 
Pressing the button validates your changes for syntactic correctness and highlights inconsistencies such as missing property definitions.\" I suspect this is because (over on schema.org itself) we're not yet publishing RDF/S property descriptions at the URIs associated with each property. We'll get there, but it's not yet happening uOn 2 May 2012 17:10, Marco Fossati < > wrote: Properties like foaf:name have entries in the wiki (e.g., schema:duration doesn't, hence the error. Here, you would ideally use: {{DateIntervalMapping | templateProperty = periodo | startDateOntologyProperty = activeYearsStartYear | endDateOntologyProperty = activeYearsEndYear }} but the date interval parser would need some changes for language-specific intervals (that is, it expects things like '[[1900]] - [[1910]]', and won't catch things like 'dal [[1900]] al [[1910]]') uFirst of all, thanks for the quick answer. On 5/2/12 8:23 PM, Jimmy O'Regan wrote: OK, so should I add schema:duration in the wiki? I will investigate this solution. Cheers, Marco u+1 On Thu, May 3, 2012 at 11:42 AM, Marco Fossati < > wrote: uHi all, I just remembered there already is a similar DBpedia property: runtime [1][2]. It even declares schema:duration as owl:equivalentProperty. The name 'runtime' is less than perfect and only applies to movies etc, 'duration' is more general. For example, schema.org/Movie also uses schema:duration. Maybe we should have renamed 'runtime' to 'duration' instead? While it's on our wiki, I just changed schema:duration from ObjectProperty to DatatypeProperty. It has range schema.org/Duration, which because of the capital D looks like a class, but is actually a datatype that looks equivalent to xsd:duration. At first I liked the idea of using schema.org properties and types, but maybe we should wait until the RDF schema for schema.org is published and used. For example, schema:duration doesn't have a dereferencable URI yet. Regards, Christopher [1] [2] On Thu, May 3, 2012 at 1:49 PM, Jona Christopher Sahnwaldt < > wrote: uHi Christopher (or do you prefer Jona? :-) ), On 5/4/12 11:23 AM, Jona Christopher Sahnwaldt wrote: Yep, that's the point. The property has multiple domains i.e. Event, MediaObject, Movie, MusicRecording. For example, schema.org/Movie also We could create specific properties for each domain and link them to schema:duration (maybe as subproperties). In this way, we can keep 'runtime' just for movies. But I think it would be clearer to enable multiple domain/ranges declaration in the wiki. I started a discussion in the wiki about that. There is an official OWL [1] as well as an RDFS [2], but I don't know whether they are actually used or not. For example, schema:duration doesn't have a In conclusion, the open question is: should we try to reuse existing ontology terms i.e. schema.org or add new ones under the dbpedia namespace? Cheers, Marco [1] [2]" "ontology issue" "uDear, Concerning property Its range has been defined as dbo:PopulatedPlace , while it is rather a code under which a populated place is classified. I would have expected a skos:Concept. See e.g. Kind Regards, Paul Hermans uDear Paul, thank you for the report, your observation is correct. This should be a datatype property with xsd:string as range rather than an object property. Due to this, I didn't find any data extracted with this property. I updated the property definition and should be fixed on the next DBpedia release. 
Best, Dimitris On Thu, Dec 22, 2016 at 3:54 PM, Paul Hermans < > wrote:" "SPARQL returning different results to those available through" "uHi all I have an issue where some SPARQL queries do not return all results that are available through endpoint I believe. Here is an example, if I go to: the page is displayed showing many literals including dbpedia-owl:abstract However, executing this sparql query: PREFIX onto: PREFIX dbpprop: PREFIX yago: PREFIX rdf: PREFIX rdfs: SELECT * WHERE { ?uri ?rel ?literal . FILTER (isLiteral(?literal)) . FILTER (?uri= ) } returns only one entry and that is rdfs:label (no dbpedia-owl:abstract or the others) Any ideas of why this is the case? Thank you in advance! Danica Hi all I have an issue where some SPARQL queries do not return all results that are available through Danica uHi Danica, if you \"go to\" mean \"use a web browser\"), you are redirected to which contains all the literals you mention. If you use that concept in your query, you'll get them all. Btw, your query is correct, but a bit awkward. You first select *all* triples in DBpedia, and then filter out those which accidentally match your subject. Why not use SELECT * WHERE { ?rel ?literal . FILTER (isLiteral(?literal)) . } Best, Heiko Am 15.08.2013 11:10, schrieb Danica Damljanovic: uThanks Heiko, very useful! Would it be correct to assume that this property will give an indication of whether the redirection happens or not: e.g. You are right about the SPARQL - it's generated automatically so it is a bit awkward and not optimised! Cheers Danica On 15 August 2013 10:45, Heiko Paulheim < >wrote:" "Newbie: Ranges, non-link values, comparison to freebase, license" "uCongrats on an exciting project! I hope it's ok as a Newbie to post a number of questions - Text values without links: I'm wondering why some values are good links but many are just text, say the residence or Alma Mater of George Bush seem to be just text - Properties without range/type: Similarly wondering why some properties have a range or type but many are missing - say why doesn't - Also in George Bush successor as governer and successor as president seem mixed together under successor - perhaps that data should be moved to separate objects like presidency and governership? - Coverage: Is there any analysis what % of topics in Wikipedia are scanned or for specific categories e.g. what % of wikipedia people articles are scanned? - License: What does it mean in Commons Attribution-ShareAlike License and the GNU Free Documentation Licens - which is it or can I choose or must I comply with both simultaneously - Wonder if anyone has done a comparison of methodology, data coverage, data quality with freebase.com Sorry for the multiple questions! Many thanks ZS Congrats on an exciting project!  I hope it's ok as a Newbie to post a number of questions - Text values without links: I'm wondering why some values are good links but many are just text, say the residence or Alma Mater of George Bush seem to be just text ZS" "Specifying topics in FOAF?" "uHi Nick, Hi John, These days the example should probably readdc:title \"My physics page\" . foaf:topic . as the topic of iand's Physics page is presumably the concept \"Physics\" rather than \"The Wikipedia Page about Physics\", which is what's implied in the original example (in the light of the TAG's httpRange-14 finding). 
Wikipedia URIs are URIs of documents not concepts, so if you ever want to reference a concept it's very much worth finding the equivalent dbpedia.org/resource/URI on DBpedia and linking to that. HTH, Tom." "how can i contribute to dbpedia ???" "uHi Ravinder, Georgi and Piet have restructured and improved the Dbpedia extraction code over the last weeks. Georgi is currently importing the last Wikipedia dump and we hope to be able to make a new extraction run sometime next week. The new code will also be submitted to sourceforge next week. After this new run, it would be great if you would help us to analyze the new data, check which of the current bugs are really fixed and where problems remain. Afterwards, you could work with Piet on solving the remaining extraction problems. What do you think? Best wishes from Berlin. Cheers Chris" "SPARQL from DBPedia looking for remote knowledge (from my website)" "uHello, dear sirs! I have a question, I don't understand where I can find an answer. If I want my website's knowledge to be available from SPARQL endpoint, how can I do this? 2 variants: to start SPARQL endpoint at my website (it should \"see\" my rdf-data there) or to use dbpedia SPARQL endpoint to look for knowledge at my website? I don't understand how to start. My aim is to make it possible to search my website's triples via sparql. And it's better for them to be available from dbpedia sparql endpoint. How can I connect my data to dbpedia sparql endpoint? Or, if it's impossible, how to start my own sparql endpoint? Thank you very much! uYou can definitely put up your own SPARQL endpoint with your own facts. The DBpedia SPARQL endpoint is it's own thing, hosting data from the DBpedia project and not a place where everybody can upload their facts. If you want to query both of them together, you could load them all into one database. This is not that hard to do if you want to make queries themselves, although the cost and effort to run a large public SPARQL endpoint can get pretty high. I think most of the time people are not really interested in the whole graph but rather in some piece of it, so a common pattern use is to copy facts from various sources into a Jena model or dataset and query that. A large number of approaches to federated querying have been tried, ranging from the standard where this is done explicitly. It can also be done in an automated manner, see and a number of research and commercial products. u0€ *†H†÷  €0€10  `†He uSo are you saying he should use the Spunger to merge DBpedia and his own database and then publish that? u0€ *†H†÷  €0€10  `†He" "what are avaiable categories at dbpedia.org/resource/category:" "uHi All, I would like to know what are the categories are available at SELECT * WHERE { ?subject skos:subject . } LIMIT 20 Please help me. Regards, Amir Hussain Windows Live: Make it easier for your friends to see what you’re up to on Facebook. social-network-basics.aspx?ocid=PID23461::T:WLMTAGL:ON:WL:en-xm:SI_SB_2:092009 uOn 16 Dec 2009, at 06:07, Amir Hussain wrote: DBpedia URIs are case sensitive, so you have to be careful with uppercase and lowercase letters. You had: So it should be: and not your version: DBpedia is created by extracting structured data from Wikipedia. So you can just go to Wikipedia and look at the articles and categories there. Then just replace \" \". I found the motorcycle category by going to the Kawasaki article on Wikipedia, which is a disambiguation page. 
I found the link to the company, and then looked at its categories (on the bottom of the page), and had to go a few steps up in the category hierarchy to arrive at Category:Motorcycle_manufacturers. Best, Richard uthanks richard for helping me. richard can you help me to build query that return Automotive Industry with email addresses, location, specialization with a specified country . The following query in RED I build with your help but how but I does not return contact information and specialization. I also want to add country parameter so it return automotive industry only for specified country. SELECT ?property ?hasValue ?isValueOf WHERE { { ?property ?hasValue } UNION { ?isValueOf ?property } } please help me to refine the above query I will be thank full to all of you. I must submit my university assignment. Regards, Amir Hussain" "Generate nerd_stats_output.tsv from nerd-stats.pig (Hugo Silva)" "uHi Hugo, lists.sourceforge.net. The LANG parameter specifies the language of your wikidump. It's the ISO code for your language. You can specify it on the command line with: `-p LANG=en` Or you can use a config file with the LANG param inside it using the -m flag. See [1], [2]. You can also type: `pig -x local -h` to see all possible flags. The README [3] gives a good example of how to run scripts from the command line in local mode. Cheers, Chris [1] [2] [3] README.md" "Label issue for the Paris resource" "uThe heuristic to extract the promoted label for the HTML view of resources seams broken for the following resource: I get the following label as title of the page: About: Wikipédia:Le Bistro/2 septembre 2005 \"Paris\" would be a better title instead. I don't know what \"Le Bistro/2 septembre 2005\" refers to." "Concept hyperlinks missing on the endpoint" "uHi, I have a problem with the DBpedia query results using the SPARQL endpoint. I can not retrieve anymore hyperlinks of a Wikipedia concept. Do you have any idea about how I can find this information in DBpedia or where to ask? The query I am using is always the same: select * where { ?p ?o FILTER regex(str(?o),'^ But now in the results the original Wikipedia hyperlinks are missing. 
Result list NOW: Result list 2 MONTHS AGO http://dbpedia.org/resource/Category:Atmosphere http://dbpedia.org/resource/Category:Atmosphere http://dbpedia.org/resource/Category:Atmospheric_radiation http://dbpedia.org/resource/Category:Atmospheric_radiation http://dbpedia.org/resource/Category:Climate_change http://dbpedia.org/resource/Category:Climate_change http://dbpedia.org/resource/Template:global_warming http://dbpedia.org/resource/Template:global_warming http://dbpedia.org/resource/Atmosphere http://dbpedia.org/resource/Carbon_dioxide http://dbpedia.org/resource/Greenhouse http://dbpedia.org/resource/Global_warming http://dbpedia.org/resource/Convection http://dbpedia.org/resource/Black_body http://dbpedia.org/resource/Global_dimming http://dbpedia.org/resource/Outer_space http://dbpedia.org/resource/Water_vapor http://dbpedia.org/resource/Ann_Henderson-Sellers http://dbpedia.org/resource/Ozone http://dbpedia.org/resource/Earth http://dbpedia.org/resource/Climate_change http://dbpedia.org/resource/Radiative_forcing http://dbpedia.org/resource/Joseph_Fourier http://dbpedia.org/resource/Mars http://dbpedia.org/resource/Titan_%28moon%29 http://dbpedia.org/resource/Venus http://dbpedia.org/resource/Absolute_zero http://dbpedia.org/resource/American_Meteorological_Society http://dbpedia.org/resource/Positive_feedback http://dbpedia.org/resource/Earth%27s_energy_budget http://dbpedia.org/resource/Anti-greenhouse_effect http://dbpedia.org/resource/Callendar_effect http://dbpedia.org/resource/Cloud_forcing http://dbpedia.org/resource/NOAA http://dbpedia.org/resource/John_Tyndall http://dbpedia.org/resource/Svante_Arrhenius http://dbpedia.org/resource/San_Francisco%2C_California http://dbpedia.org/resource/Greenhouse_gas http://dbpedia.org/resource/Earth%27s_radiation_balance http://dbpedia.org/resource/Idealized_greenhouse_model http://dbpedia.org/resource/Infrared http://dbpedia.org/resource/Methane http://dbpedia.org/resource/Earth%27s_atmosphere http://dbpedia.org/resource/Intergovernmental_Panel_on_Climate_Change http://dbpedia.org/resource/Nitrous_oxide http://dbpedia.org/resource/Anthropogenic http://dbpedia.org/resource/Wisley_Garden http://dbpedia.org/resource/Addison-Wesley http://dbpedia.org/resource/Solar_greenhouse_%28technical%29 http://dbpedia.org/resource/Climate_forcing http://dbpedia.org/resource/Paleoclimatologists http://dbpedia.org/resource/Carbon_dioxide%23Variation_in_the_past http://dbpedia.org/resource/Chlorofluorocarbon%23Chloro_fluoro_compounds_.28CFC.2C_HCFC.29 http://dbpedia.org/resource/Pluto%23Atmosphere Thanks in advance. Angela." "Please describe more files; topical_concepts is not quite correct" "uThe following files at are not described at On the right are my brief notes, but it would be great if Volha or someone else could add them to the excellent page above: article_templates all templates used on a page, dbo:wikiPageUsesTemplate interlanguage_links_chapters ??? topical_concepts the topical page for each category, eg dbp:Category:Programming_languages skos:subject dbp:Programming_language genders determined heuristically, by counting he/his/him vs she/hers/her freebase_links more at geonames_links more at flickr_wrappr_links unfortunately the service doesn't work, always returns \"unable to locate any photos\" infobox_test not used pnd not used (obsolete) surface_forms this file is described but does not exist for download 1. Volha, could you describe what is interlanguage_links_chapters and how is it different from interlanguage_links? 2. 
skos:subject as used in topical_concepts is not good, since there's no such property. I'd suggest to use foaf:focus \"The underlying or 'focal' entity associated with some SKOS-described concept.\" uHi Vladimir, just to answer your first question shortly. interlanguage_links_chapters only contains the interlanguage links between those languages for which a DBpedia mapping chapter exists and provides dereferencable URIs. At the time of extraction of DBpedia 2014 these languages were: eu, cs, nl, en, fr, de, el, id, it, ja, ko, pl, pt, es. Thus, interlanguage_links_chapters is a subset of interlanguage_links. Cheers, Daniel" "Different data results for URI from SPARQL endpoint" "uHi all, I have a problem getting all the data for a particular URI using the SPARQL endpoint When I run the query select distinct ?property ?object where { ?property ?object} I get only 3 properties back (see results here But on when I use the URI in the browser it clearly has more data: Am I missing something here??? Thanks uHi Igor, On 04/26/2012 12:15 PM, Igor Popov wrote: uHi Mohamed, Thanks for the reply. But this makes no sense to me, strictly talking like a developer. Why have two URIs, and in terms of coding that does mean that every time I get a URI I need to check for this property and if it does do another query with that URI. It looks like overkill to me. I'm sure there is some reason for this - I'm just trying to understand if I need to make additional queries. Sorry, I'm not a DBpedia expert. u0€ *†H†÷  €0€1 0 + uIn a nutshell: is a redirect to and that's why is a redirect to JC On Thu, Apr 26, 2012 at 13:19, Kingsley Idehen < > wrote: uThanks for the clarification." "DBPedia Lookup" "uHi, The DBPediaLookUp (lookup.dbpedia.org) service is down for sometime now. Any idea why? And when will it be up and running again? *Rohana RajapakseSenior Software DeveloperGOSS Interactive* t: +44 (0)844 880 3637 f: +44 (0)844 880 3638 e: w: www.gossinteractive.com Hi, The DBPediaLookUp  ( lookup.dbpedia.org ) service is down for sometime now. Any idea why? And when will it be up and running again? Rohana Rajapakse Senior Software Developer GOSS Interactive t : +44 (0)844 880 3637 f: +44 (0)844 880 3638 e: w: www.gossinteractive.com" "Local acces to virtuoso endpoint error" "uHi. I get the following error message: \"42001 Error SR185: Undefined procedure SPARQL.SPARQL.st_point. \" when executing following query : \"PREFIX geo: SELECT DISTINCT ?m ?geo WHERE { ?m geo:geometry ?geo . FILTER ( bif:st_intersects ( ?geo, bif:st_point(13.379273,52.516863), 30 ) ) } \" on a local virtuoso endpoint with dbpedia also locally installed. No problem executing the query in the public SPARQL DBPedia endpoint. Had someone a similar problem, or any idea why this happens? Thanks! uHi Ruben, I presume the are using the Virtuoso open source product for hosting your local DBpedia instance which does not have the Geo Spatial support available in the Virtuoso commercial product that hosts public DBpedia SPARQL endpoint ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 7 Oct 2010, at 14:20, Ruben Navarro-Piris wrote: uHi Ruben, Note, you can obtain an evaluation copy of the Virtuoso commercial product to test from: for one of the available OS'es and reuse the existing open source database file you have DBpedia loaded into already as they are compatible. 
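If only the open-source build is available, a rough workaround that needs no geospatial extension is a plain bounding-box filter on the WGS84 latitude/longitude properties (a sketch; the box below loosely surrounds the point used in the query above):

  PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
  SELECT DISTINCT ?m ?lat ?long WHERE {
    ?m geo:lat ?lat ;
       geo:long ?long .
    FILTER (?lat > 52.3 && ?lat < 52.7 && ?long > 13.1 && ?long < 13.7)
  }
  LIMIT 100

It is slower and cruder than bif:st_intersects, but it runs on any plain SPARQL endpoint.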
Alternatively, we do have an Amazon AWS snapshot of the currently available DBpedia 3.5.1 public endpoint, which is a 4 node clustered database, that can be attached to a Virtuoso EC2 AMI instance in the cloud as detailed at: I hope one of these solutions is suitable for your needs Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 8 Oct 2010, at 08:20, Ruben Navarro-Piris wrote:" "getting images from a wikipedia article" "uCan anyone please tell me how images from a certain wikipedia article can be obtained by performing sparql query? thanks uHi, select ?image where { ?image . } You can also try Georgi" ""title" question" "uhi! first of all, thx for bringing the download server back :) My question now is, regarding the titles. If we take this for example: \"AfghanistanTransnationalIssues\"@en . so normally the title would be: \"Afghanistan Transnational Issues\", with blanks in between, right? So another example: \"Air Transport\"@en . here the resource is written like that: Air_Transport spereated with an underline and the very title is written correctly as: \"Air Transport\" So why this difference here, namely \"AfghanistanTransnationalIssues\" written all together and \"Air Transport\" What I would need is the real name as it is the real title on wikipedia itself (With blanks in between and upper and lower case). Is that possible? best martin uHi Martin, We are in fact taking the titles as they appear on the top of Wikipedia pages. The example you sent, AfghanistanTransnationalIssues, is a redirect page: The title of this redirect page, does not include spaces. For Air_Transport, the title of the (redirect) page includes spaces: To get the titles of main articles, you could, for instance, follow the redirect links. I hope this helps. Best, Max On Fri, Jun 25, 2010 at 4:37 PM, Martin Kammerlander < > wrote: uMax, thx for the information! I might got a bit confused by the fact that dbpedia names those 'redirection titles' as titles. Am I seeing this right that ie. AfghanistanTransnationalIssues is something like the \"ID\" of a wikipedia article? So something that never changes, where the \"real title\" (in this case \"Foreign relations of Afghanistan\") might changes? Yes Max I can follow your advice ad follow the redirect links and parse out the 'real title'. Since those titles are the thing I'm currently interested in. However it would be pretty awesome if dbpedia could offer also the \"real titles\" on the download page. In my eyes this would totally upgrade the dpedia information. best martin Zitat von Max Jakob < >:" "Problems with LinkedData access to DBpedia" "uHi all, the liked data access to DBpedia seems not to work currently. Is this a known issue? Ex: shows: \"Execution of \"/DAV/VAD/dbpedia/description.vsp\" failed. SQL Error: 42001 SR185: Undefined procedure[]\" Raphael uHi Raphael, The issue has been resolved, please try again Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 5 Oct 2009, at 10:55, Raphael Doehring wrote:" "What is the difference between DBpedia categories and Yago types ?" "uI noticed that DBpedia categories and Yago types are roughly the same; is there any difference that I could not distinguish ? I noticed that DBpedia categories and Yago types are roughly the same; is there any difference that I could not distinguish ?" "sparql probs" "uHi all, I am trying to send SPARQL queries to SOAP but am getting HTTP 500 back. 
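For completeness on the image question above: the image links in the DBpedia datasets are published as foaf:depiction and dbo:thumbnail, so a fuller version of that query might look like this (a sketch; dbr:Berlin is just an illustrative resource):

  PREFIX dbo:  <http://dbpedia.org/ontology/>
  PREFIX dbr:  <http://dbpedia.org/resource/>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?image WHERE {
    { dbr:Berlin foaf:depiction ?image }
    UNION
    { dbr:Berlin dbo:thumbnail ?image }
  }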
The query I am sending is the default one ie \"select distinct ?Concept where {[] a ?Concept}\". Can anyone be so gentle so as to explain exactly how to send the queries via SOAP? Thanks a lot for the co-operation uOn 6 Dec 2007, at 08:14, Savio Neville Spiteri wrote: AFAIK the endpoint doesn't support SOAP. Sending the query via plain old HTTP works. Richard uSavio Neville Spiteri wrote: Savio, You mean the SPARQL Protocol over SOAP? If so, then it should work. We support SOAP 1.1 & 1.2 . Please try: Kingsley uSavio Neville Spiteri wrote:" "Can not connect to http://dbpedia.org" "uHi I know it is funny but my computer connects all sites except But, even if I can ping the dbpedia now still I can not access Why can it be ? Hi I know it is funny but my computer connects all sites except ?" "Access to Movie-Awards Data for another OpenSource Project" "uHello all, I'm new to dbpedia, very interested and just have read a lot of available documents about the whole project. My special interest is to use dbpedia to generate a database of movies, that were nominated or are winners of several movie-awards, with language-support for at least en, fr, de and dk (AFAIK the most users of the software come from there). The open-source-project I'm talking about is called \"TVBrowser\" (www.TVBrowser.org), a free of charge EPG (Electronic Program Guide). There is still a \"handmade\" Award-Database at but it takes too much time to keep it uptodate and to add translations, find mistakes, add new awards by hand and with makros Because I'm also new to SPARQL, I may not find the correct and optimum Query, to get the data for the Award, I want to get. That's what I need for each movie with awards: Award_Name (en,de,fr,dk), Movie (en,de,fr,dk), Category (en,de,fr,dk), Production_Year, Year_of_Nomination/Win, Reciepient(s), Director (each Hyperlink to Wikipedia would be nice, too) My main problem is to Filter the movies by an award and by \"nominated\" and \"winner\". I guess, to get the rest of information is more easy then. Is it possible, that there is no possibility at this time given, to find out, if a movie has won or was nominated? It would be very kind, if someone could help me a little bit with my problem here. Thanks for every help! Regards, tvb_user GRATIS für alle WEB.DE-Nutzer: Die maxdome Movie-FLAT! Jetzt freischalten unter http://movieflat.web.de" "How can I extract information from wikipedia comparison tables?" "uExample: I want to choose software project management tool (with issue tracking). I found plenty of information on pages: Comparison of issue-tracking systems and Comparison of project management software. The problem is that it's not convenient to select tool using 10 tables. I want to write something like SQL query and find all matches. I know about DBpedia which offers SPARQL interface, but as I understand it, it extracts information only from infoboxes and categories, so it lacks information about specific features which are described in comparison tables, not in infoboxes. I actually wrote small script (on Python) which parses on fly several wikipedia tables saved as CSV files and applies filters. With it I found that I have to choose between Redmine, Bloodhound, , but I don't like this approach because it's completely not universal and limited. Does anybody has suggestions how to extract information from wikipedia comparison tables in a \"right way\"?" "unknow chars" "uHi, this query select distinct ?onto ?graph {GRAPH ?graph {?onto a < whats wrong with the 1st(blank) result? 
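A starting point for the award filter might be something along these lines (only a sketch: dbo:award is an assumption, it is not always populated for films, and the extracted data generally does not distinguish nominations from wins - the raw dbp:awards property is another place to look):

  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT ?film ?award WHERE {
    ?film a dbo:Film ;
          dbo:award ?award .
  }
  LIMIT 100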
and what char is that on the other two? onto graph ?p?swrc thanks in advice, Hi, this query select distinct ?onto ?graph {GRAPH ?graph {?onto a < advice," "Missing mappings in extraction_framework" "uHi, I tested the new java/scala extraction_framework and looked up also the old php code. Many mappings that did work before, are missing now. Can somebody explain to me why the development moved away from working code to the new java platform? I went almost one year back in this mailinglist and did not found any hint for this. I assume it was for performance reasons, but the old code looks more complete to me. Thanks,   Jean Hi, I tested the new java/scala extraction_framework and looked up also the old php code. Many mappings that did work before, are missing now. Can somebody explain to me why the development moved away from working code to the new java platform? I went almost one year back in this mailinglist and did not found any hint for this. I assume it was for performance reasons, but the old code looks more complete to me. Thanks, Jean uHi, I wrote to this group a couple days ago regarding some missing mappings. In the meantime I found out how to use the MappingTool and corrected some entries and sent them back to dbpedia. It seems that those entries are not reflected and I guess it's because I'm a new user. Can somebody who is in charge check my account for proper permissions? One example change I made was in German language: Vorlage:Infobox Ortsteil einer Gemeinde in Deutschland The old incorrect entry was (is):     {{ GeocoordinatesMapping | latitudeDegrees = lat_deg | latitudeMinutes = lat_min | latitudeSeconds = lat_sec | longitudeDegrees = lat_deg | longitudeMinutes = lon_min | longitudeSeconds = lon_sec }} I changed \"longitudeDegrees = lon_deg\", since it was mistyped. I also added new mappings in other ontologies. My username in mappings.dbpedia.org is \"jeandbp\" Thanks,   Jean Gould Hi, I wrote to this group a couple days ago regarding some missing mappings. In the meantime I found out how to use the MappingTool and corrected some entries and sent them back to dbpedia. It seems that those entries are not reflected and I guess it's because I'm a new user. Can somebody who is in charge check my account for proper permissions? One example change I made was in German language: Vorlage:Infobox Ortsteil einer Gemeinde in Deutschland The old incorrect entry was (is): {{ GeocoordinatesMapping | latitudeDegrees = lat_deg | latitudeMinutes = lat_min | latitudeSeconds = lat_sec | longitudeDegrees = lat_deg | longitudeMinutes = lon_min | longitudeSeconds = lon_sec }} I changed \"longitudeDegrees = lon_deg\", since it was mistyped. I also added new mappings in other ontologies. My username in mappings.dbpedia.org is \"jeandbp\" Thanks, Jean Gould uDone. Happy mapping! On Tue, Feb 14, 2012 at 7:52 PM, Jean Gould < > wrote: uHi Pablo, Thanks! I changed my first two mappings and tested them. When you go to {{Begriffsklärungshinweis}} {{Infobox Gemeinde in Deutschland |Art               = Stadt |Wappen            = Wappen-stadt-bonn.svg |Breitengrad       = 50/44/02.37/N |Längengrad        = 7/5/59.33/E I mapped Breitengrad to latitude and Längengrad to longitude. But the parser does something wrong and takes only the first number for the mapping. In this case 50 and 7. I looked also in the generated nt- and nq-files and see that the data is wrong there, too. I guess it is treated as a numeric field, where it should be treated as a string which needs to be parsed sometimes.   
Jean Hi Pablo, > Done. Happy mapping! Thanks! I changed my first two mappings and tested them. When you go to Deutschland |Art = Stadt |Wappen = Wappen-stadt-bonn.svg |Breitengrad = 50/44/02.37/N |Längengrad = 7/5/59.33/E I mapped Breitengrad to latitude and Längengrad to longitude. But the parser does something wrong and takes only the first number for the mapping. In this case 50 and 7. I looked also in the generated nt- and nq-files and see that the data is wrong there, too. I guess it is treated as a numeric field, where it should be treated as a string which needs to be parsed sometimes. Jean uHi Jean, In general, there is no real standard way to fill in infobox values. It can contain numbers, links and even other templates. This lack of regularity makes parsing it a pain. For example, in English, the field is filled like this: |lat_deg = 50 |lat_min = 44 |lat_sec = 02.37 |lon_deg = 7 |lon_min = 5 |lon_sec = 59.33 While in German, it is like this: |Breitengrad = 50/44/02.37/N |Längengrad = 7/5/59.33/E There seems to exist already a GeoCoordinates parser in the DEF, that gets triggered when you use GeocoordinatesMapping instead of a PropertyMapping. You would probably have to extend the GeoCoordinateParser to include this new way to express lat/log Or you can try to convince Jona to extend it for you. :) Cheers, Pablo PS: This answer was written with help from Anja, Robert and Jona. On Wed, Feb 15, 2012 at 3:30 PM, Jean Gould < > wrote:" "BDM Calculation" "uDear All Members, We want to calculate Balanced Distance Metric (BDM) for different entities in YAGO2s and DBpedia. When using GATE for calculating the BDM score some memory related issues are coming. Is it possible that we get some pre-calculated BDM scores against DBpedia entities from somewhere so that we do not have to calculate these scores again. Thanks, Dear All Members, We want to calculate Balanced Distance Metric (BDM) for different entities in YAGO2s and DBpedia. When using GATE for calculating the BDM score some memory related issues are coming. Is it possible that we get some pre-calculated BDM scores against DBpedia entities from somewhere so that we do not have to calculate these scores again. Thanks," "how to register a RDF into DBPedia" "uDear All; How are you? I am a kind of beginner for DBPedia. I am wondering a way to register a RDF (geopolitical ontology which contains country and region information) into DBPedia. I have the URL for it. It is an OWL format, but we can change into RDF. Is there any good reference or site for describing a way to register? Thanks for your answer in advance. : ) Best Regards, Soonho Kim uHello, I'm not quite sure, what do you mean by register. In general, it is sufficient to include outgoing links in your data set to DBpedia. e.g. fao#organization owl:equivalentClass db-owl:Organization . or fao:Togo owl:sameAs db-owl:Togo . Another way could be to include a schema class mapping into the wiki, but this is currently untested (I'm not sure if we allow to set equivalentClasses to arbitrary namespaces or just a whitelist, as this is prone to ontology hijacking ). If you cover a large domain of topics included in Wiki/DBpedia like person, movies, medicine it might make sense to include backlinks to your data set in DBpedia directly. If this is the case, please send us the links you created. Now that I think of it, it would make sense to have a portal, where we everybody could upload there mappings to and from DBpedia. 
It would need to have a revision system, so people could keep their mappings up to date. Has anybody heard of such a system, where third parties can upload links, which are then published? Regards, Sebastian Am 27.07.2010 16:53, schrieb Kim, Soonho (OEKM): uSebastian Hellmann wrote: The semweb is really about AAA (anybody can say anything about anything) and that's not scalable if we need to hassle the dbpedia people just to \"link\" with dbpedia. How about Hugh Glaser's sameas.org? Or alternatively, you can publish an NT file that asserts owl:sameAs (or something similar for everything) and tell interested people they can download it. Now some people don't like owl:sameAs, some go so far to say Hugh is a terrorist who's trying to bind everything into one big Katamari ball, and my answer to that is a new predicate I'm working on. It's just like owl:sameAs but it doesn't mean anything uOn 8/19/10 4:08 PM, Paul Houle wrote: Based on the AAA principle, its impractical for DBpedia to be the arbiter of who or how its referenced, really. A Web of Linked should simply be about cross references being discovered in a myriad of ways th: 1. Twitter announcements 2. Mailing list posts 3. Blog posts 4. Serendipitous Discovery via lookup spaces and services etc Paul: if you have a linkbase, you should publish at a URL. I can then load it into its own Named Graph within one of the many Linked Data Spaces that we host (including the Virtuoso instance hosting DBpedia). Kingsley" "How does DBpedia live extractor work?" "uHi, How does DBpedia live updater (extractor) work? Does it go over all changes in Wikipedia and update their correspondences in DBpedia? I updated some Wikipedia articles (Both article body and infobox)that have corresponding entries in DBpedia and kept monitored them in DBpedia along with changeset for days, but they were not updated. uHi Abdulfattah, DBpedia is explained in detail in the DBpedia live paper you can find here: The last couple of months DBpedia live is stalled due to a change in the Wikipedia update stream The French Chapter took over the fix and they are close to done After rolling the fix we will start parsing the updated pages but will take some time to get 100% up to date On Mon, May 9, 2016 at 6:56 PM, Abdulfattah Safa < > wrote: uHi Dimitris, Thanks for your explanations. So my procedure in monitoring updates is correct. Right? I'm doing it on English Wikipedia. Another thing, should the changes you mentioned be applied on each chapter separately? Greetings, Abed On Tue, May 10, 2016 at 11:33 AM, Dimitris Kontokostas < > wrote:" "Getting started with DBPedia" "uHi all, I'm Prashanth, a student from India. I have been reading articles about the Semantic web and I'm really excited to work on that. To start I decided to create a \"W\" type question search engine which will give me answers for all those \"W\" type question Eg., \"Who is the Prime minister of India\"- Ans: Dr Manmohan Singh \"When was he born?\" \"Who are all the musicians who were born in India\" Questions like these. So I turned towards DBPedia which will help me in achieving my goal. I'm comfortable with Java/C# and bit of python. I request anyone of you to please guide me to where I should get started. Thanks in advance. uYou should perhaps start with the QALD1 workshop. Cheers Pablo On May 27, 2012 7:57 AM, \"Prashanth Swaminathan\" < > wrote: You should perhaps start with the QALD1 workshop. wrote:" "setting up DBpedia on local machine?" 
"uHi everyone, I'm trying to get DBpedia to be installed on my local machine using a trial version of Virtuoso. I do not have much experience at all in databases and I'm running into all sorts of questions I'm struggling to find answers to. I've read the other thread a few posts back about Savio having the same question, but mine also involves the basic steps in getting everything set up. What I have are the datasets downloaded already, and I have the Virtuoso server all ready to go. The question is, how do I import the datasets (the .nt files) into Virtuoso? I am very lost dealing with the Virtuoso Composer interface and am on the verge of giving up. The only database experience I have is with mySQL and phpMyAdmin so please bear with my beginner's questions. That said, is it for me to dump the .csv versions of the datasets into mySQL, and how, if at all possible, will I be able to use SPARQL to retrieve data? Lastly, and most importantly, are there free alternatives to Virtuoso for triple stores/SPARQL execution? I've read in the same thread that Luis Stevens has an alternative. Luis, if you can tell me how you got your local DBpedia running (preferably with instructions to guide me through) with Redland, I will really, really appreciate it. Thanks, Andrew uHello, Andrew (Chuan) Khoo schrieb: Virtuoso has an Open Source version [*] (just in case you thought it is not open). Other alternatives are Jena and Sesame. See between them. Jens [*] virtuoso" "The LAST INVITATION for ICDIPC2016, Beirut, Lebanon - April 21, 2016" "u[Apologies for cross-posting. Please forward to anybody who might be interested.] The Sixth International Conference on Digital Information Processing and Communications [ICDIPC2016] April 21-23, 2016 - Beirut, Lebanon Website: Email: Paper Due : Extended to April 6, 2016 To submit your paper: TOPICS: * Information and Data Management    * Social Networks * Data Compression    * Information Content Security * E-Technology    * Mobile, Ad Hoc and Sensor Network Management * E-Government    * Web Services Architecture, Modeling and Design * E-Learning    * Semantic Web, Ontologies * Wireless Communications    * Web Services Security * Mobile Networking, Mobility and Nomadicity    * Quality of Service, Scalability and Performance * Ubiquitous Computing, Services and Applications    * Self-Organizing Networks and Networked Systems * Data Mining    * Data Management in Mobile Peer-to-Peer Networks * Computational Intelligence    * Data Stream Processing in Mobile/Sensor Networks * Biometrics Technologies    * Indexing and Query Processing for Moving Objects * Forensics, Recognition Technologies and Applications    * Cryptography and Data Protection * Information Ethics    * Peer-to-Peer Social Networks * Fuzzy and Neural Network Systems    * Mobile Social Networks * Signal Processing, Pattern Recognition and Applications    * User Interfaces and Usability Issues form Mobile Applications * Image Processing    * Sensor Networks and Social Sensing * Distributed and parallel applications    * Social Search * Internet Modeling    * Embedded Systems and Software * User Interfaces,Visualization and Modeling    * Real-Time Systems * XML-Based Languages    * Multimedia Computing * Network Security    * Software Engineering * Remote Sensing WE ARE SINCERELY LOOKING FORWARD TO SEE YOU IN BEIRUT IN APRIL 2016 [Apologies for cross-posting. Please forward to anybody who might be interested.] 
The Sixth International Conference on Digital Information Processing and Communications [ICDIPC2016] April 21-23, 2016 - Beirut, Lebanon Website:" "URLs that aren t cool" "uYes, such as and I agree that this can be annoying. One have to make sure to not lose the case information (as it happened to me with lookup.dbpedia.org once, hence merging FROG and Frog). But what do you suggest to do about that, Paul? Should Wikipedia make URLs case-insensitive and then enforce disambiguation with ()? Best, Georgi uGeorgi Kobilarov wrote: If (wikipedia) were my site, I'd do two things: (i) map all case-variant forms to a single form (New yOrK cITy -> New York City;) \"FROG\" gets renamed to \"FROG Cipher\" or \"Frog (Cipher)\" (ii) do a permanent redirect from variant forms to the canonical form I think what dbpedia is doing is reasonable considering the situation. My own system for handling generic databases has both a VARBINARY and VARCHAR field for dbpedia URLs/labels. It does a case-insensitive lookup first, and if that fails, looks at the alternatives that turn up. It's also got some heuristics for dealing with redirects, disambiguation, and all that. In the big picture I see \"naming and identity\" as a specific functional module for this kind of system uIt's a stretch for MediaWiki but this kind of topic went through my head when I was thinking how I'd model my own wiki engine. Personally I'd drop the 1 to 1 Title <=> Blob relation (where blob is either page text, or text indicating a redirect to another title with a blob). My thoughts have been more like: - Title: The visual representation of the title; ie: What shows up in the tag and the - Name: Multiple identifiers which can be used to refer to the page. (ie: One of the possible things you'd type in the url to get to the page) A page would probably have a primary name which the secondary names would redirect to. (Likely just the first name in the list) The equivalent to adding a set of \"redirects\" would be editing the page, and adding a few more \"names\". The equivalent to moving a page would be editing the page, and replacing one of the names with another. My thoughts on disambiguation and cases like Frog and FROG were: 1) When looking for a page first the engine would look for case-sensitive matches and if one is found return that page. 2) If multiple exact matches are found then a automatic disambig page is returned, unless one of those pages have been marked as the primary page for that specific name, in which case that primary page is used instead. 3) If no matches are found, a case-insensitive match is looked for. 4) In this case if multiple case insensitive matches are found it compares them and uses the one which is more case similar. 5) Otherwise if the multiples share the same name then it follows the same as 2) for exact matches 6) Falls back to a \"Did you mean\", \"or you could create this page>\" kind of page. A) Pages with ambiguous names have links to the disambig pages on them; or in the case of a single ambiguity just a link to that page. Basically, for pages like \"Ownership\" which have no terms to disambiguate could be reached by /ownership /OWNERSHIP /Ownership, etcif the name \"Ownership\" was used. But for pages like \"Frog\" and \"FROG\" if the page on the frog species used the name \"Frog\" while the page on the FROG Cypher used the name \"FROG\". Then /Frog would point to the frog species page, /FROG would point to the FROG Cypher page because it's a case match. 
And /frog would point to the frog species page because it is more case similar to \"Frog\". And both Frog and FROG would have ambiguity links pointing to each other. The system would probably have a bit of hinting while editing about pagename ambiguities, and would have report pages for things like \"pages with only an ambiguous name\" (ie: Say Frog and FROG both used the name \"frog\" and used no other name; This would force the system to always display a disambig page and have to do something like using a page uid to refer the pages; These would be listed so they could be fixed) ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [ Paul Houle wrote:" "Sparql-query fetching uri from Danish Wikipedia URL" "uHello I'm trying to find the dbpedia-uri from a danish wikipedia URL. Is that possible at all? The danish wiki-url is mentioned here: But I get no results, when I try to do this: PREFIX dbpprop: SELECT ?uri WHERE { ?uri dbpprop:wikipage-da \" } Am I doing it wrong? MVH /Johs. W. uHi Johannes, The the object in your where clause is an IRI not a literal , and thus should be specified with <> rather than with double quotes \" \" , so the query should be: PREFIX dbpprop: SELECT ?uri WHERE { ?uri dbpprop:wikipage-da . } which returns subject uri Note you also need to ensure the graph name specified is Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 10 Mar 2010, at 08:51, Johannes Wehner wrote: uHi Johannes, unfortunately the according dump isn't loaded to the DBpedia SPARQL endpoint. The list of loaded dataset can be found here: The dataset you need would wikipage_da.nt which can be downloaded from the Downloads section. Cheers, Anja On 10.03.2010 09:51, Johannes Wehner wrote: uExcellent! Thank you!!" "Commons & pictures mapping" "uis a very important dataset, describing what's in effect a global multimedia library. 0. It would be great if it could be described at I guess many people have missed it simply because it's not described. 2. Regarding the mapped props: I've checked several of the mapping pages, and they are unused. 3. Regarging I checked http://commons.dbpedia.org/resource/Herbert_Eugene_Valentine: - I'd suggest to change hasNationalArchivesIdentifier to sameAs, since the value is http://research.archives.gov/description/533197 and that's the URL representing that image - I'd suggest to change identifiedBy to dc:identifier, to fit common use Do you agree? Please comment on the Discussion page of that mapping 4. Could someone fill me in on what's the best way to map images, coatOfArms (I made such a property), etc? E.g. for this Bulgarian tzar: https://bg.wikipedia.org/w/index.php?title=Ñèìåîí_I&action;=edit It has these fields: | èçîáðàæåíèå çà ëè÷íîñòòà=[[Ôàéë:SimeonTheGreatAntonoff.jpg|300px]] | îïèñàíèå íà èçîáðàæåíèåòî=Ñèìåîí ïðåä ïîðòèòå íà Êîíñòàíòèíîïîë (ãðàôèêà [[Íèêîëàé Ïàâëîâè÷]]) Which are mapped like so: {{ PropertyMapping | templateProperty = èçîáðàæåíèå çà ëè÷íîñòòà | ontologyProperty = picture }} {{ PropertyMapping | templateProperty = îïèñàíèå íà èçîáðàæåíèåòî | ontologyProperty = pictureDescription }} Unfortunately in the result: http://mappings.dbpedia.org/server/extraction/bg/extract?title=Ñèìåîí+I&revid;=&format;=turtle-triples&extractors;=custom * picture is not mapped, maybe because it's not at Commons but a local attachment [[Ôàéë:]] * pictureDescription is mapped to the URL of the painter , since it's an objectProperty" "URIs in 404?" 
"uI get 404 errors when accessing any resource in the dbpedia.org/data namespace, such as: The other parts of the site seem to be working fine. What's up? Richard" "resource redirects" "uDear Dbpedia Team and Community, Could you please explain me resource redirects? For example, clicking at How to get a resource corresponding to Celtic rock band)? Many thanks in advance Pavel uHello Pavel the data currently displayed at version 3.8. It was extracted from a dump of en.wikipedia.org from 1 June 2012: At the time, redirect to changed on 18 June 2012 - too late for inclusion in DBpedia 3.8: We will publish DBpedia 3.9 soon. In this release, Regards, Christopher On 24 August 2013 17:11, Pavel Smrz < > wrote:" "French version of dbpedia-live" "uHello, I'm currently working on a French version of dbpedia-live and would like to know if my approach is good. In our case we want to have a live updated each 24h and not in real time like in the international version. I will use the live extraction framework published on . I have (like in the international live) a local wikipedia mirror communication with my live using OAI. But in order to retrieve the wikipedia upstream modifications I have no access to the wikipedia OAI and so i want to use th incremental dumps published everyday by wikimedia . Do you think it's a good approach ? I'm currently working on the first extraction of data to virtuoso and will work on the incremental dumps import next week. Maybe this kind of work has been made in other country earlier ? For the mapping wiki what's the best solution to plug the extraction framework with ? I spend some time on your mailling list looking for informations about the live. (My result is on ) Thanks, Christophe uHi Christophe, You can find more information about the Live Extraction and how to set it up here: . I've written some scripts to automate the process and I can also set it up or guide you trough it. If you are the maintainer of a DBpedia language chapter, you can request an OAI proxy access from Dimitris. The approach with the Incremental Dumps is intriguing since it doesn't require access to the OAI-PMH stream which would enable a lot of people to host experimental clones of the system. However we might get a lot more extraction errors using it: \"*Here's the big fat disclaimer.* /This service is experimental./ At any time it may not be working, for a day, a week or a month. It is not intended to replace the full XML dumps. We don't expect users to be able to construct full dumps of a given date from the incrementals and an older dump. We don't guarantee that the data included in these dumps is complete, or correct, or won't break your Xbox. In short: don't blame us (but do get on the email list and send mail: see xmldatadumps-l ).\" I don't know if anyone tried using the incremental dumps, I didn't know they existed, this service seems to be quite new. The extraction framework should actually work out of the box with these dumps, either with the dump extraction or by packing the dumps in a mediawiki instance and calling the live extraction. I will try it out. Cheers, Alexandru On 04/07/2014 12:30 PM, Christophe Desclaux wrote: uOn 07/04/2014 14:38, Alexandru Todor wrote: yep thanks for the scripts. They are really useful. ok i will yes it's what i want. Yes the disclaimer is really huge but it seems to be the only solution in order to have updated dumps (without OAI-PMH access). 
According to their mailling list it seems to be working since 4 years and the project was lunched in 2009 ok cool, i will try this and give some feedback. Thanks, Chris u‰PNG " "using DBpedia locally" "uHi all, I am doing my thesis to get my B.Sc. I.T. (Hons) degree. As part of my Final Year Project (which is the thesis itself) I am investigating how use of ontologies could be made to help users bookmark webpages. I would like to use dbpedia as a kind of a 'universal ontology' in such a way that the bookmark file structure would be like the 'natural order of things'. I have a query with regards to how I actually should interact with the system. Since the system would be using dbpedia over and over again (for example bookmarking a single page would most probably result in a (large) number of queries), it definitely wouldn't make sense to send a sparql query to the sparql endpoint each time, else bookmarking a single page would take very long indeed! I need to have a local version of the entire data set. Is this possible? (Which version of the dataset should i download from * way to interact with the local source? I appreciate your help a lot, since i am a novice in this area, and i am still feeling a little bit like a stranger to all this! Thanks a lot, Thanks for your co-operation, uSavio, On 13 Feb 2008, at 16:57, Savio Neville Spiteri wrote: Interesting project. Just a word of caution: Note that DBpedia isn't an ontology. From an ontology point of view, DBpedia is just a bunch of instances, without a nice well-engineered class hierarchy. Two different experimental class hierarchies are available for DBpedia (YAGO and CWCC), but both have their flaws, are very complex, and I wouldn't say that they represent the \"natural order of things\" uHi Richard, Richard Cyganiak wrote: uSavio, We at knogee.com use the DBpedia data set locally. We use Ruby for everything, so we use the redland libraries which are C libraries, but have multiple language interfaces: If you'd like an example of the code we use to read the .nt files and transform them into a database for the redland SPARQL queries, let me know Cheers, Luis. Quoting Richard Cyganiak < >: uLuis Stevens wrote: Virtuoso definitely is a good choice, however to allow even simpler scenarios we also produced (starting from the last release) CSV files, which can be load into leagacy SQL databases. In MySQL for example using the \"LOAD DATA INFILE\" syntax. Of course afterwards you will not have the comfort to use SPARQL or any RDF specific functionality, but sometimes it is useful to keep it simple and stupid ;-) Sören uHello, Michael K. Bergman schrieb: uThanks, Jens. Mike Jens Lehmann wrote:" "YAGO classes in dbpedia?" "uHi All, I have a very basic question about yago classes. Could anyone tell me where I can find the hierarchy of yago classes in dbpedia? i.e. all the subclass equivalent class axioms for these yago classes. I don't see yago class axioms in the ontology file. I also checked the Original YAGO classes, but it seems that dbpedia uses different names. e.g. it is CelticPeople in dbpedia and it is wikicategory_Celtic_people in yago. By applying such simple mapping between classes, I can find only a very small portion of matches. However dbpedia certainly has the taxonomy axioms. e.g. of I can find the file for such axioms. uHi Xingjian, On 04/02/2013 11:15 PM, Xingjian Zhang wrote:" "Problem with DBPedia extraction" "uI am trying to prepare a simple DBPedia Armenian version. 
I tried to follow these instructions: But they seem to be out of date. Are there more recent instructions? I have found some messages in this list that explain the use of the \"run\" command. In fact, In fact, I was able to download the armenian files running the following instruction: /run download config=download.minimal.properties However, when I try to run: /run extract extraction.hy.default.properties I obtain the following: [INFO] Scanning for projects[INFO] uHi Labra, In fact, I was able to download the armenian files running the Can you try /run *extraction* extraction.hydefault.properties you should change this to languages=hy for now Besides that, everything else seems ok Best, Dimitris uThanks for your answerI have tried again with that change and now I get the following error: INFO: Will extract redirects from source for hy wiki, could not load cache file '/home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj': java.io.FileNotFoundException: /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj (No existe el archivo o el directorio) I noticed that this error is similar to the one posted in this message ( could not find an answer The command that I executed is: /run extraction extraction.hy.default.properties Best regards, Labra The full output is the following: [INFO] Scanning for projects[INFO] uCan you check if you have this in your properties file? source=pages-articles.xml.*bz2* It seems like it is searching for an unzipped version but your download config doen't unzip it On Sun, Oct 14, 2012 at 12:48 AM, Jose Emilio Labra Gayo < >wrote: uI checked and I did not have. In the download properties file, I have download=hy:pages-articles.xml.bz2 I also had: unzip=false and I changed to unzip=true but it remails the same. I attach the properties files that I am using. The commands that I execute are: $ /run download config=download.hy.minimal.properties [INFO] Scanning for projects[INFO] uCan you pull and update from the repo. What you encountered is a (new) known issue where the mediawiki updated it's schema to 0.7 and the mappings wiki is still on 0.6. I disabled the namespace check for now until this is fixed. I think that now it should work Best, Dimitris On Sun, Oct 14, 2012 at 7:43 PM, Jose Emilio Labra Gayo < >wrote: uI pull and updated the repository, also, I did \"mvn clean install\" to recompile the sources. However, I obtain the following errorit seems related to the namespace, but I am not sure which XML file is trying to parse /run extraction extraction.hy.default.properties [INFO] Scanning for projects[INFO]" "Inconsistency Feedback from DBpedia to Wikipedia What's disjoint in the dbpedia ontology?" "uHi all Indeed. And it should stay that way, and of course it is bound to be globally inconsistent forever, as Wikipedia is and will continue to be. There is no way it could be otherwise, since collective knowledge is inconsistent passed a certain scope and size. And this is IMHO rather a good thing. The perspective of a globally consistent view of the world is really a totalitarian nightmare! Even each of our individual knowledge(s) is inconsistent, I guess. At least so is mine. :-) So what is DBpedia ontology good for? Obviously for (partial) use in data linking and integration, but certainly not for building a global consistent view of the world à la Cyc. 
But a potential benefit of reasoning on this ontology, and maybe the most interesting one in the long run, is to explicit inconsistencies and feed them back to Wikipedia curators (as has been said, with all due diplomacy to Wikipedia work). Editors down there are often very well aware of inconsistencies, they are often pointed in discussion pages, for instance, decisions about it might be pending due to edition wars etc. But pinpointing them by expliciting the semantics does help. Some inconsistencies can be obvious errors or categorisation, like putting a place in a \"person\" category, but I guess many (most) of them actually point to things not yet clearly defined, or about which conflicting ontologies coexist. In such cases, they are topics about which knowledge is in the making, very often the most interesting ones! Maybe it could be suggested to Wikipedians to create a category \"Logically Inconsistent\", to mark those articles and categories. This shows at Web scale what I experience daily in data migration to semantic environments. Even if, as often happens, the resulting system do not meet the initial expectations, a side effect is in any case to have people look closely at their data and vocabulary for the necessary semantic sanity check, and this is very useful per se. If you ask me, even more useful than the system implemented, which is bound to have a very short life cycle anyway, as we all know. As Bruno Bachimont uses to say, an ontology is mainly a tool to explicit inconsistencies of our knowledge, pointing to new questions for research. After that, you can throw it away. :-) When the ontology is consistent, you have nothing to discover any more. Trying by all means to hide or wipe out inconsistencies is based on some paranoid logician's view of the world, similar to those who did not want to look into Galileo's telescope because sunspots, Moon's mountains and Jupiter's satellites were not mentioned in Aristote and the Bible Cheers" "Double-encoding on SPARQL endpoint results?" "uHi Something seems to have changed on the SPARQL endpoint at attention to some news announcement somewhere, but The XML coming back from the endpoint when using format=xml seems to have been doubly utf8 encoded. For example, a character such as C3 A5 (letter a with ring) is returned as C3 83 C2 A5 (utf8 for C3 followed by utf8 for A5). or is there something else that I'm doing wrong? Thanks Aran uHi Aran, the dataset hasn't changed, so there must have been a change in the Virtuoso service running the sparql endpoint. @Openlink: Can you shed light on this? Cheers, Georgi uHi Aran/Georgi, We are looking into possible causes and will report back in a bit Best Regards Hugh Williams Professional Services OpenLink Software On 6 Jun 2008, at 11:59, Georgi Kobilarov wrote: uHi Aran, The problem has been resolved, please rerun your queries to confirm. Note the Dbpedia service was updated yesterday as indicated in Kingsley's blog post at: id=13744 Best Regards Hugh Williams Professional Services OpenLink Software On 6 Jun 2008, at 11:46, Aran Lunzer wrote: uHello Hugh Yup, they're behaving much better now. Thank you. My browser couldn't see that address, but I guess the content you're referring to is the same as the top item at Yago inferencing looks powerful; I'll give it a go. 
Bye for now - Aran uAran Lunzer wrote: Yes :-) The URI above includes a local domain part :-( A common error we make from time to time :-) Kingley" "Help with Getting started and already available answering engine projects" "uHi, I'm a student from India. I am very excited about the future of linked data and how DBpedia can make that dream come true. I've been reading a bit about semantic web from the book \"Semantic web primer \" . I want to create a \"Semantic answering engine\", where I get answers for the questions and not links, which the traditional search engines give. Eg: 1) Who are the musicians who were born in Chennai,India. 2) Where did Abraham Lincoln die? 3) Give me the capital of all states in India Only before a few days, I heard from my friend that there are already a lot of semantic answering engines that are already available. Despite of already available semantic engines I also want to develop my own \"semantic answering engine\". And I want to work on the output of the data, more of data visualization, with graphs and charts using javascript. I think DBpedia is the right place to get started to realize my idea. So, before I jump into learning how it is done, I wanted to know what are the already available projects which implement this. Could someone please tell me if there are projects which give exact answers for the questions like those? Thanks in advance. uHi Prashanth, On 11/18/2012 01:20 PM, Prashanth Swaminathan wrote: AutoSPARQL project [1] lets you ask complex SPARQL queries to DBpedia with low effort. uHi Prashanth, There is a challenge on this topic : The following paper gives a survey on this topic : Vanessa Lopez, Victoria S. Uren, Marta Sabou, and Enrico Motta. Is question answering fit for the semantic web?: A survey. Semantic Web Cheers, Julien" "Hello World" "uHi there, I would like to know about support of edit history in the dbpedia, that would be a key feature. Specifically I would like to try and partition a group of user into two parts, ones that introduce a tag/term to an article and ones who remove it. I have experimented with using git to analyse the wikihistory but it was very difficult on the text level. I was wondering what support dbpedia would have on such an analysis and at what detail level. Sorry that I have not read the fine docs or searched much, I figured that asking might provide the best answer anyway. thanks, mike uHi Mike, the 'dump' version of DBpedia currently does not include code to analyze article history. We usually do not even download the files that include the history of an article, we only us the current version. I think the 'live' version of DBpedia includes some history data, but I don't know much about that. The guys from Leipzig should be able to help. :-) The parser included in DBpedia tokenizes the source code of a Wikipedia page, but is mostly aimed at structured data and does not analyze the text. If I understand correctly, you need some kind of tool that analyzes the textual differences between article versions and groups users by these differences. I am afraid there is not much that DBpedia can help you with. All, please correct me if I'm wrong. Particularly, DBpedia spotlight analyzes the unstructured text - how about that? Regards, Christopher On Tue, Apr 3, 2012 at 22:20, Mike Dupont < > wrote: uOn Tue, Apr 3, 2012 at 10:40 PM, Jona Christopher Sahnwaldt < Correct. 
I think it would be possible to just look for key words, there are user who will go around and consistently add in one particular word to a set of articles, and others who will remove that and add another. I would be willing to study the history of a select set of articles, even the rdf extracted would be different. there are edit wars over parts of articles that would show up in dbpedia on the top level. I will look into ityes very nice. I am willing to do my own coding, it would be nice to have an infrastructure for doing this. I have looked at the wikparser code in 09 and other things, not up to date. THANK YOU. this is a good start. So I think that this would comprise of a feature extractor that would create rdf triples out of the text that i am looking for and then compare the changes in the triples over time. ideally we would only have to store the fact that a user introduced or retracted some triple and that would be stored as some delta. these are just some ideas. mike" "DBpedia license" "uThe DBpedia license has recently changed. I believe it should be clarified further, in order to not violate the original Wikipedia license. impose any effective technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License.* This Section 4(a) applies to the Work as incorporated in a Collection, but this does not require the Collection apart from the Work itself to be made subject to the terms of this License.\" I am not saying that the recently added attribution limitation is in direct violation of this paragraph, but there are definitely parts of DBpedia that it cannot apply to, such as the Work apart from the Collection (not the other way around). The DBpedia license has recently changed. I believe it should be clarified further, in order to not violate the original Wikipedia license. From 4(a): 'When You Distribute or Publicly Perform the Work, You may not impose any effective technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License. This Section 4(a) applies to the Work as incorporated in a Collection, but this does not require the Collection apart from the Work itself to be made subject to the terms of this License.' I am not saying that the recently added attribution limitation is in direct violation of this paragraph, but there are definitely parts of DBpedia that it cannot apply to, such as the Work apart from the Collection (not the other way around). uOne might also be inclined to question the legality of the change with respect to contributors under the old license. One such question arose when Wikipedia changed their license. From : \"Are such unilateral changes to a license legal in all jurisdictions where people may wish to re-use our content? - We believe that licensing updates that do not fundamentally alter the spirit of the license and that are permitted through the license itself are legally valid in all jurisdictions.\" However in this case, the added clausal does \"alter the spirit\" and what is permitted, from the old license. 2012/9/23 Emil Müller < > One might also be inclined to question the legality of the change with respect to contributors under the old license. One such question arose when Wikipedia changed their license. From . 
uEmil, All content provided by DBpedia is extracted from Wikipedia, there are no additional content contributors. So I don't see a problem that DBpedia applies exactly the same licensing terms as Wikipedia. Sören Am 23.09.2012 12:42, schrieb Emil Müller: uHi Emil, could you be so helpful and send me a link to where the licence changed? I couldn't find it. We have made some changes to: Initially, the paragraph sounded like a licence restriction, but within the same day we renamed it to \"best practice for attribution\". Has the licence changed anywhere else? I think there is nothing wrong with providing best practice recommendations. Quite the opposite, we really would wish Wikipedia would provide such best practices for RDF, so we would be sure how to give proper attribution. We include some links back to the wiki article, but we are unsure whether this is best practice. Sebastian Am 23.09.2012 12:02, schrieb Emil Müller:" "Questions arised from the mapping marathon" "uHi, On 11/08/2011 12:16 PM, Pablo Mendes wrote: In DBpedia-Live, we reprocess all the changed pages we get from the Wikipedia update stream, and we also reprocess the pages that are affected by a mapping change. The pages we get from the Wikipedia update stream have higher priority, so they are reprocessed first. So, the pages affected by a mapping may take a few minutes to get reprocessed depending on how many live pages are waiting for reprocessing, but it will not take long to appear. Same as the previous issue, it may take a few minutes to appear in DBpedia-Live. I guess that is not too much effort in the parser. I had a look at the ontology wiki and found only the \"Organisation\" class and did not find the other one you mentioned." "Inquiry for contribution" "uHello everyone, Could anyone let me know how I could contribute to DBpedia, especially the idea of checking consistency in the ontology. I am good at Java, Python. Thanks in advance. Kind regards, Ankur Padia." "Infobox Extraction Questions" "uHi, Try the following SPARQL select ?p ?o where{ ?p ?o } It will return all the results DBpedia has on that subject including: ?p = ?o = human error My question is why is the object a text string and not a URI? If you look up a disease, say Malaria, you can find its cause in the abstract as text but you cannot ask the question what causes Malaria and have it answered in dbpedia. I know these are difficult to extract but has anyone thought about doing this? Why is cause in the infobox of the train wreck but not in that of the disease? In Malaria the infobox points to a diseasedb reference (7728) can I follow that somehow to get the cause of malaria as a URI, and can that URI be used usefully with dbpedia? Another question is why can't I find the train wreck triple when I do the following: grep The_Great_Train_Wreck_of_1856 infobox_en.nt In fact I can't find it (the train wreck cause triple) in any of the files (I am looking at files I downloaded 2 months ago and querying dbpedia live but has this cause property been added to the infobox this recently?) Thanks! Marv" "Making Your LInked Data Discoverable" "uAll, Over the years there has been a subtle \"blind spot\" re.
Linked Data that ultimately leads to many a debate thread about Linked Data utility and applications etc. Thus, I would like to outline a few simple best practices that would go a long way to putting these recurring issues to rest, once and for all. Whenever you publish an (X)HTML based Linked Data interface please try to achieve at least one of the following: 1. Use Linked Data URIs in the @href attribute of the tag associated with the literal values that typically label the subject, predicate, or object of an RDF triple. 2. Use the tag's @rel attribute to associate the (X)HTML document with Linked Data URIs of entities it describes or simply mentions (casually or topically). 3. If you have server access and admin level privileges, repeat #2 on the server side using the \"Link:\" header in HTTP responses. Irrespective of how subjectively beautiful or ugly an (X)HTML interface might be re. Linked Data display, the utility of your Linked Data is ultimately compromised if it cannot be discovered, shared, and easily referenced by other Linked Data tools. Remember, this is all about webby structured data that leverages the RDF data model u+1 to a best practice guide i was also unaware of the \"Link\" response header until today. came across it as of this morning in connection with powder [1]. wkr [1] On Wed, 2013-01-23 at 15:09 -0500, Kingsley Idehen wrote: uOn 1/24/13 4:53 AM, Leigh Dodds wrote: With regards to this snippet from [1]: The \"rev\" parameter has been used in the past to indicate that the semantics of the relationship are in the reverse direction. That is, a link from A to B with REL=\"X\" expresses the same relationship as a link from B to A with REV=\"X\". \"rev\" is deprecated by this specification because it often confuses authors and readers; in most cases, using a separate relation type is preferable. Here is my response to Mark Nottingham (circa. 2010) about this matter [2]. Basically, if it doesn't produce markup that makes browsers balk, we (LOD community and practitioners) can simply own it for our specific purposes. Ultimately, if need be, we make the case for its resurrection to those who do currently understand its utility. Links: 1. rfc5988#page-8" "Non English Articles (with no Englishequivalent)" "uHi Neil, that is still correct for the newest DBpedia release. Only data from articles which have an English equivalent is extracted. I personally think it would be useful to mint URIs for resources in all languages and interlink them if possible. Next release maybe Cheers, Georgi From: [mailto: ] On Behalf Of Neil Ireson Sent: Monday, February 18, 2008 4:24 PM To: Subject: [Dbpedia-discussion] Non English Articles (with no Englishequivalent) Hi all, Hopefully a simple question equivalent then this article does not appear in DBPedia, can any one inform me if this is correct for version 3.0? N" "DBpedia srategy survey" "uDear DBpedians, Sören Auer and the DBpedia Board members prepared a survey to assess the direction of the DBpedia Association. We would like to know what you think should be our priorities and how you would like the funds of the association to be used. *Your opinion counts* – so please contribute actively in developing a better DBpedia. We kindly invite you to vote here: _ We will publish the results in anonymized, aggregated form on the DBpedia website. We are looking forward to your input.
All the best, Julia" "Update: Guides for using Microsoft PivotViewer with SPARQL endpoints" "uAll, An update to what I sent out yesterday. There are also two FAQs feeding off comments and discussions I've had with people offline. For example, why do some of my demos take so long to run etc. The answer lies in understanding the degree to which SPARQL and PivotViewer have been integrated re. dynamic CXML generation. Example: I have a demo that uses the SPARQL endpoint at RPI as part of a 7-day earthquake tracking meshup [1]. The SPARQL in question is SPARQL-FED, and I've set no LIMIT on the resultset" "Computing the "open" dbpedia KB" "uHi everyone, I was wondering if anyone has gone to the trouble of computing the \"open\" KB, as opposed to the full closure which is provided in the downloads. Essentially, I'm looking for a version of dbpedia with all the redundancy removed. For example, the following four facts are in \"instance_types_en.nt\": . . . . However, the first three facts are implied by the fourth using the dbpedia ontology (i.e., a President is-a politician, which is-a Person, which is-a Thing). Before I go and write my own tool to produce this data set I thought I'd check to see if anyone has either a tool or a download I could use. Thanks, Jeff Pound uOn 1/17/11 4:05 PM, Jeff Pound wrote: You can use inference rules to deal with redundancy and other issues re. DBpedia, e.g. property name changes in the latest cut [1]. Hopefully, we'll get some notes with examples in the coming days. In the meantime if you go to , enter a text pattern e.g., \"Lincoln\", hit enter, and click on the \"options\" link in the navigation DIV, to see the existing inference rules[2]. Links: 1. full" "Languages as resources : launching lingvoj.org" "uOn 3 Aug 2007, at 12:01, Bernard Vatant wrote: All rdf:type statements with categories in the object position (like 2 and 6) are in the dataset due to a (now fixed) bug in the extraction process and will be gone when the next dump is loaded into the store. I think 5 is clearly bogus too, but I'm not sure if that is fixed yet. I don't know if submitting plain RDF instance data to an OWL reasoner will do any good in general. It will either blow up or not tell you anything of interest. I'm happy to inform you that this has happened already. They are concepts. Best, Richard" "Making human-friendly linked data pages more human-friendly" "uMatthias Samwald wrote: Pubby isn't how DBpedia is published today. It is done via Virtuoso (been so for quite a long time now), which has in-built Linked Data Publishing/Deployment functionality [1]. You can tweak the HTML template and just send it to us. BTW, the URIBurner [2] pages which also use exactly the same Linked Data Deployment functionality behind DBpedia also have a slightly different look and feel. That can be applied to DBpedia in nano seconds. Not seen as criticism, just a wake up call. On our part (OpenLink) we've always sought to draw a small line between OpenLink branding and the more community oriented DBpedia project. Thus, our preference has been to wait for community preferences, and then within that context apply updates to the project, especially re. aesthetics. Links: 1. VirtDeployingLinkedDataGuide.html" "Are you guys pinging the semantic web?" "uHi there, is there any automatic mechanism to inform Ping the Semantic Web of the pages you have ?
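Returning to Jeff's question above about computing a redundancy-free ("open") instance-types file: below is a minimal, standalone sketch of the kind of filtering he describes. The input file names (subclass_pairs.tsv, instance_types_en.nt), the tab-separated subclass pairs, and the naive line-based N-Triples handling are illustrative assumptions rather than any existing DBpedia tool; a real implementation would read the ontology directly with a proper RDF parser.

import java.io.*;
import java.util.*;

// Drops rdf:type triples that are already implied by a more specific asserted type,
// given a transitive subclass closure built from (child, parent) pairs.
public class RedundantTypeFilter {

    static Map<String, Set<String>> superClasses = new HashMap<>();

    public static void main(String[] args) throws IOException {
        // 1. Load (child, parent) subclass pairs, assumed to be pre-extracted from the ontology.
        try (BufferedReader r = new BufferedReader(new FileReader("subclass_pairs.tsv"))) {
            String line;
            while ((line = r.readLine()) != null) {
                String[] p = line.split("\t");
                superClasses.computeIfAbsent(p[0], k -> new HashSet<>()).add(p[1]);
            }
        }
        // 2. Naive transitive closure; fine for a class hierarchy of a few hundred classes.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Set<String> supers : superClasses.values()) {
                for (String s : new ArrayList<>(supers)) {
                    Set<String> more = superClasses.get(s);
                    if (more != null && supers.addAll(more)) changed = true;
                }
            }
        }
        // 3. Group rdf:type lines by subject and keep only the most specific types.
        Map<String, Set<String>> typesBySubject = new LinkedHashMap<>();
        try (BufferedReader r = new BufferedReader(new FileReader("instance_types_en.nt"))) {
            String line;
            while ((line = r.readLine()) != null) {
                String[] t = line.split(" ");   // naive: assumes "<s> <p> <o> ." with no extra spaces
                if (t.length < 4 || !t[1].contains("22-rdf-syntax-ns#type")) continue;
                typesBySubject.computeIfAbsent(strip(t[0]), k -> new HashSet<>()).add(strip(t[2]));
            }
        }
        try (PrintWriter out = new PrintWriter(new FileWriter("instance_types_minimal.nt"))) {
            for (Map.Entry<String, Set<String>> e : typesBySubject.entrySet()) {
                for (String type : e.getValue()) {
                    if (!impliedByOther(type, e.getValue())) {
                        out.println("<" + e.getKey() + "> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <" + type + "> .");
                    }
                }
            }
        }
    }

    // True if some other asserted type already entails this one via the subclass closure.
    static boolean impliedByOther(String type, Set<String> asserted) {
        for (String other : asserted) {
            if (!other.equals(type) && superClasses.getOrDefault(other, Collections.<String>emptySet()).contains(type)) {
                return true;
            }
        }
        return false;
    }

    static String strip(String uri) {
        return uri.replace("<", "").replace(">", "");
    }
}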
If you don't, i vote +100 on this feature to be added asap (its a body mass weighted vote, by which i just count 3 times richard for example) :-) see you guys soon in berlin :-) Giovanni" "Please help test the German Chapter Live Endpoint" "uHi, After the great meeting we had last week in Amsterdam, I decided to burn the midnight oil and to turn the German chapter live. It is online for testing purposes, I'm considering making it the default endpoint when I'm sure about its stability. You can access the live Sparql endpoint under: I would really appreciate if someone would spare the time and check if the changes in the German Wikipedia actually get reflected in the live version of the German chapter. I haven't been able to notice any errors until now but I'm sure some issues will pop up if more people use it. Beware (there be dragons), we have some known issues: the dbpedia virtuoso plugin is kind of a hack. In the live version, the resource pages are generated from the live data, but the links in the page will still lead you to the old resource pages (you'll know what i mean if you look at some resource pages). I'll fix this soon, I need to recompile Virtuoso for every change i make in a vad package and it takes forever. It would be really nice if the Dutch chapter would share their new dbpedia_vad package, or if someone would tell me a more efficient way to compile the virtuoso vad packages. Cheers, Alexandru Alexandru Todor Freie Universität Berlin Department of Mathematics and Computer Science Institute of Computer Science Königin-Luise-Str. 24/26, room 116 14195 Berlin Germany" "Problems with Italian Ontology Infobox Types" "uHi all, although I have known dbpedia for many years, this is the first time I post something on the mailing list, I hope this is right. I've found that a lot of information is missing in the i18n dataset for the Italian language, which I downloaded at this URL: There are very few classes and instances, e.g. there aren't individuals for the Organization class. I'm asking where the problem is, and how I can help with mappings for the Italian language, since it's my first language. Thank you, Riccardo uCiao Riccardo, Thanks for offering help.
Please register at mappings.dbpedia.org and send us your username. After we give you rights, you will be able to map infoboxes of Organization pages to their corresponding Class/Properties in the DBpedia Ontology. You can then check out how many mappings exist for Italian, and how it compares with the other languages here: Http://mappings.dbpedia.org/sprint/ Arrivederci, Pablo On Feb 15, 2012 2:59 PM, \"Riccardo Tasso\" < > wrote: uHere is my first mapping (the one for which I started to give my contribution): If anyone can give it a check, italian or not, it would be very appreciated. Another question: why the \"Test this mapping\" link doesn't work? Thank you, Riccardo 2012/2/16 Pablo Mendes < > uOn 17 February 2012 23:55, Riccardo Tasso < > wrote: Try 'View source' :) uHi Riccardo, looks great! The \"Test this mapping\" button works in Firefox and Internet Explorer, but not in Chrome. (I didn't test other browsers.) Probably an XML namespace issue. We'll try to fix that soon. Cheers, Christopher On Sat, Feb 18, 2012 at 00:55, Riccardo Tasso < > wrote: uVery good :-) It's a quite annoying work, but it gives me a great satisfaction. Let see if I'll find thed time to contribute again. Cheers, Riccardo Il giorno 18/feb/2012 01:24, \"Jona Christopher Sahnwaldt\" < > ha scritto: u\"Test this mapping\" now also works in Chrome. Happy mapping and testing, Crome users! Gory details: the Jersey HTTP server threw an exception because Chrome sent an \"Accept\" header that our stylesheet content-type was not compatible with. Fixed in On Sat, Feb 18, 2012 at 01:24, Jona Christopher Sahnwaldt < > wrote:" "Simple Infobox_Disease query" "uHi there, I am trying to design a simple query to return a list of the 4,600 disease infoboxes that are part of dbpedia. However, I cannot seem to retrieve more than 320 diseases, which is somewhat frustrating! My query is as follows: ?name dbpedia2:wikiPageUsesTemplate < ?name dbpedia2:meshid ?meshid . } I'm sure it's just a simple error that I'm making. Any help would be very much appreciated! Thank you very much in advance, Best wishes, Chris Hi there, I am trying to design a simple query to return a list of the 4,600 disease infoboxes that are part of dbpedia. However, I cannot seem to retrieve more than 320 diseases, which is somewhat frustrating! My query is as follows: Chris uOn 9/13/10 7:01 PM, Christopher Kelly wrote: Try: sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&should-sponge;=&query;=PREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0APREFIX+xsd%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0D%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+dc%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0D%0APREFIX+%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E%0D%0APREFIX+dbpedia2%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2F%3E%0D%0APREFIX+dbpedia%3A+%3Chttp%3A%2F%2Fdbpedia.org%2F%3E%0D%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0A%0D%0ASELECT++%3Fname+%3Fname+%3Fmeshid+WHERE+%7B+%0D%0A+++++%3Fname+dbpedia2%3AwikiPageUsesTemplate+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FTemplate%3AInfobox_Disease%3E+.%0D%0A+++++%3Fname+dbpedia2%3Ameshid+%3Fmeshid+.%0D%0A%7D+&format;=text%2Fhtml&debug;=on&timeout;= uHi, the queries seem to be identical, what is the difference? Using sparql instead of snorql. 
It has a double ?name variable, which doesn't seem to add a lot. Actually this query brings a little bit more: SELECT count(*) WHERE { ?name dbpedia2:wikiPageUsesTemplate ?infobox . ?name dbpedia2:meshid ?meshid . Filter (?infobox IN ( , ) ). } @Christopher The problem is that Template: values are stored as the appear in the Wikipedia page, which has good reasons (once you normalize them, you can't go back). So you might need to find more variants. How did you find the info that there are 4600 infoboxes? The overview WhatLinksHere also includes all Templates that redirects to Infobox_disease . Cheers, Sebastian Am 14.09.2010 03:30, schrieb Kingsley Idehen: uOn 9/14/10 1:38 AM, Sebastian Hellmann wrote: The point was about using sparql endpoint rather than snorql . Maybe typo, since it was cut and paste once the count exceed the limits claimed etc Yes, I did the count before sending the sparql endpoint link, as per comments above. Kingsley uThank you very much indeed for your help Kingsley and Sebastian! Very much appreciated. The 4600 quote was found at \"The DBpedia knowledge base currently describes more than 3.4 million things, including 4,600 diseases\". I presume that the majority of this info is from the wikipedia infoboxes, hence thought this would be a good way of filtering? So I was just trying to get a handle on as much of this as possible for a medical education project (for which this kind of data - linking disease names / mesh / emedicine / wikipedia etc is just ideal!). I wonder if the other 3900 diseases (I'm getting 947 diseases from the \"count\" query below?) are not from the infoboxes then? Thanks again, Best wishes, Chris On Tue, Sep 14, 2010 at 4:21 PM, Kingsley Idehen < >wrote: uAh ok, there lies the hare in the pepper. The query you assembled gives you all articles using the certain infobox AND having the property meshid. For diseases you could use this: SELECT count(*) WHERE { ?s a } (you can register to edit the Wiki) Another way might be to try this tool, we at AKSW created: it will be released in a few days. We are working on some bugs and will add some features (such as exporting the sparql query) You can try it, if you want, we are creating manual and tutorial and bugfixes, so please have patience, All saved concepts will be lost, because we have to upgrade it. Cheers, Sebastian Am 14.09.2010 21:34, schrieb Christopher Kelly: uAh great - thanks - makes sense. Still learning SPARQL hereapologies. Just one final question (sorry), if I wanted to adjust your query below to display \"dbpedia2:meshid\" values if a disease has it stored (but not to exclude them if it's missing, like I did originally), how would I adjust the query? SELECT count(*) WHERE { ?s a } Thank you once again! Best wishes, Chris On Tue, Sep 14, 2010 at 9:20 PM, Sebastian Hellmann < > wrote: uHello Christopher, I created a site here: It is quite new so you haven't found it before and nobody can blame you for asking the list. Could you do me a favor and post your question on semanticoverflow.com or stackoverflow.com and tag it dbpedia. (this is not to annoy you but rather a good way that will help others to find answers and help) If you do not want to: the answer is to include OPTIONAL {?s dbpedia2:meshid ?meshid} Cheers, Sebastian Am 14.09.2010 23:32, schrieb Christopher Kelly: uFantastic Sebastian - thanks once again for all your help. I will definitely check out the other sites that you suggest. 
Best wishes, Chris" "Norwegian (bokmål) infobox extraction" "uIf implemented in the near future I would like to use the Norwegian infobox extraction feature for my master thesis. By doing some queries and looking at dbpedia NextSteps ( infobox extraction is already in place. Swedish and Norwegian languages share a lot of common features. Is anyone working on Norwegian infobox extraction at the moment? This might be a task where I can contribute, but is there any way to get an overview of what has been done and what needs to be done for Norwegian infobox extraction to work? Best regards, Jeanette" "Powerset: Semantic extraction from Wikipedia based on natural language" "uPowerset is a new natural language search engine. In their demo, they are extracting structured information from Wikipedia not from infoboxes, but from the article text! Here's a short intro: Powermouse is a window into Powerset's natural language index. When Powerset reads sentences in Wikipedia, we go from open text to representations of meaning. In other words, we take text and turn it into structured \"facts.\" When users enter a query into Powermouse, they'll be able to browse the \"facts\" stored in our index. In the example below, when wrestling star \"Hulk Hogan\" is entered in the first \"something\" box, users can see all of the facts we've indexed about him. Now, if you add \"defeat\" into the connection box, users see all of the facts that we've indexed from Wikipedia about wrestlers that Hogan has defeated. Here's a Hulk Hogan screenshot. In addition to showing off the power of our index, Powermouse also shows a different type of interface that's possible with a natural language index. Here are some example queries from their site: \"what did Steve Jobs say about Apple?\" \"who criticized Google?\" \"who acquired Peoplesoft?\" \"who manufactures CPUs?\" \"what did Mozart compose?\" \"what did Andy Warhol paint?\" \"who depicted urban life?\" I shamelessly attached some screencasts from the beta section. Cheers, Christian uChristian Becker wrote: Christian, This is all well and good, but are the Structured Facts available in an open format that enables 3rd parties to interact with the resulting Data Graph persisted in Powerset? This is what we referred to as Linked Data where the Facts are serialized using RDF and entry points exposed via URIs. Thus, can I dereference facts via a Powerset URI using RDF for instance? Or am I confined to working with Web Services APIs provided by PowerSet? uOn Nov 6, 2007, at 2:47 PM, Kingsley Idehen wrote: Kingsley, as you probably guessed, no ;) I don't see them going there either, as this is just a demo. I just thought that this is an interesting development that may be of interest at some point. If we ask them, they might share a dump with DBpedia, which we could then turn into RDF and publish as Linked Data. GNU FDL discussions anyone? ;) Cheers, Christian" "template parsing bug" "uHello, There is an issue with SimpleWikiParser in the extraction framework regarding template parsing. Strangely formatted templates like this one: {{template | value |= }} are not parsed as template nodes but as text nodes instead. Apart from preventing data extraction it results in incorrect abstracts on Polish Dbpedia. For example on parameter values.
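As a rough, standalone illustration of the input described above (this is not the framework's SimpleWikiParser, which is written in Scala; it is just a toy splitter showing that the "|=" part yields a parameter with an empty name, which a robust parser still has to treat as part of the template rather than falling back to a text node):

import java.util.*;

public class MalformedTemplateDemo {
    public static void main(String[] args) {
        String wikiText = "{{template | value |= }}";
        // strip the surrounding braces and split the parameter list on '|'
        String inner = wikiText.substring(2, wikiText.length() - 2).trim();
        String[] parts = inner.split("\\|");
        System.out.println("template name: " + parts[0].trim());
        List<String[]> params = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            String key = eq >= 0 ? parts[i].substring(0, eq).trim() : "";   // "" for "|=" and for positional values
            String value = eq >= 0 ? parts[i].substring(eq + 1).trim() : parts[i].trim();
            params.add(new String[] { key, value });
        }
        for (String[] kv : params) {
            System.out.println("  key='" + kv[0] + "' value='" + kv[1] + "'");
        }
        // The second parameter comes out with an empty key and an empty value;
        // the bug report is that such input should still produce a template node.
    }
}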
BTW, I noticed a couple of issues when trying to report this issue. 1) I couldn't submit a bug on SourceForge at permission denied error. Is there any reason to restrict bug reporting to project members only? 2) I wanted to create a test case for it but I couldn't find any tests for the parser part in the repository. Are there any? Regards, Piotr uAny thoughts on this? I wrote some test cases and a fix that I can contribute in case you are interested. Piotr On 2012-09-06 01:13, Piotr Jagielski wrote: uHi Piotr, Any contribution is always welcome! However, the case you are referring to seems strange. Abstracts are not generated by the SimpleWikiParser, they are produced by a local wikipedia clone using a modified mediawiki installation. Best, Dimitris On Mon, Sep 10, 2012 at 7:30 PM, Piotr Jagielski < >wrote: uDimitris, I guess I'm confused about the project structure. I looked at AbstractExtractor.scala. It clearly uses PageNode to figure out what the abstract is and I figured out that PageNode is created by SimpleWikiParser. I now see that there is some PHP code for a lot of stuff including abstract extraction. I don't understand the relationship between the Scala extraction framework and the PHP code and I'm wondering if you mean the latter when you refer to \"modified mediawiki installation\". When I used AbstractExtractor.scala to generate the abstract for of a strangely formatted template not parsed correctly. Anyway, I can now access the bug tracker so I will submit a patch there. Regards, Piotr On 2012-09-11 08:39, Dimitris Kontokostas wrote: uHi Piotr, We will happily accept your patch :) You can take a look at [1] & [2] for more details on abstract extraction. Best, Dimitris [1] [2] On Wed, Sep 12, 2012 at 10:37 PM, Piotr Jagielski < >wrote: uThis question keeps coming up, so I added hints to the documentation. 4.3. Running Abstract Extraction Cheers, Pablo On Thu, Sep 13, 2012 at 7:13 AM, Dimitris Kontokostas < >wrote: uOK, I submitted a bug with proposed fix and test cases at Thanks for the link to the documentation. Now I know where the confusion came from. I should have mentioned that I tweaked the code locally a little bit in order to generate abstracts without a local MediaWiki instance :-) I used SimpleWikiParser to create a PageNode to pass to AbstractExtractor. The issue is in SimpleWikiParser. Piotr On 2012-09-13 11:51, Pablo N. Mendes wrote: uHi Piotr, Thank you for the patch. Although it catches an error case, it seems safe to include in the framework. About the PageNode abstracts, can you give us some quality feedback? It is something we always wanted to test but couldn't find the time for. Best, Dimitris On Fri, Sep 28, 2012 at 5:57 PM, Piotr Jagielski < >wrote: uI haven't done extensive tests but one thing to improve for sure is the abstract shortening algorithm. You currently use a simple regex to solve a complex problem of breaking down natural language text into sentences. java.text.BreakIterator yields better results and is also locale sensitive. You might also want to take a look at a more advanced boundary analysis library at Regards, Piotr On 2012-10-01 07:42, Dimitris Kontokostas wrote: uOur main interest is the text quality; if we get this right the shortening / tweaking should be the easy part :) Could you please give us some text quality feedback and if it is good maybe we can start testing it on other languages as well Best, Dimitris On Tue, Oct 2, 2012 at 11:11 PM, Piotr Jagielski < >wrote: uWhat do you mean by text quality?
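To make the BreakIterator suggestion above concrete, here is a minimal, hedged sketch of locale-aware, sentence-based shortening. The class and method names are invented for illustration and this is not the framework's actual shortening routine; as Piotr notes, extra abbreviation handling would still be needed on top of the JDK's boundary analysis.

import java.text.BreakIterator;
import java.util.Locale;

public class SentenceShortener {

    // Keep whole sentences until adding another one would exceed maxLength.
    public static String shorten(String text, int maxLength, Locale locale) {
        if (text.length() <= maxLength) {
            return text;
        }
        BreakIterator sentences = BreakIterator.getSentenceInstance(locale);
        sentences.setText(text);
        sentences.first();
        int lastGoodEnd = 0;
        for (int next = sentences.next(); next != BreakIterator.DONE; next = sentences.next()) {
            if (next > maxLength) {
                break;
            }
            lastGoodEnd = next;
        }
        // Fall back to a hard cut if even the first sentence is longer than maxLength.
        return lastGoodEnd > 0 ? text.substring(0, lastGoodEnd).trim() : text.substring(0, maxLength).trim();
    }

    public static void main(String[] args) {
        String abstractText = "This is the first sentence. This is the second sentence, which makes the text longer than the limit.";
        System.out.println(shorten(abstractText, 80, Locale.ENGLISH));   // prints only the first sentence
    }
}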
The text itself is as good as the first couple of sentences in the Wikipedia article you take it from, right? Piotr On 2012-10-02 22:49, Dimitris Kontokostas wrote: uOn Wed, Oct 3, 2012 at 12:42 AM, Piotr Jagielski < >wrote: Well, that is what I am asking :) Is it (exactly) the same text? The problem is with some templates that render text (i.e. date templates) If we can measure their usage extend we could see if this is the way to go. Best, Dimitris uPerhaps it would help the discussion if we got more concrete. Dimitris, do you have a favorite abstract that is problematic (therefore justifies using the modified MediaWiki)? Perhaps you can paste the wiki markup source and the desired outcome and Piotr can respond with the rendering by his patch. On Oct 3, 2012 8:31 AM, \"Dimitris Kontokostas\" < > wrote: wrote: first couple of sentences in the Wikipedia article you take it from, right? If we can measure their usage extend we could see if this is the way to go. shortening / tweaking should be the easy part :) good maybe we can start testing it to other languages as well wrote: the abstract shortening algorithm. You currently use a simple regex to solve a complex problem of breaking down natural language text into sentences. java.text.BreakIterator yields better results and is also locale sensitive. You might also want to take a look at more advanced boundary analysis library at >>>> uI don't have a concrete test-case, I have to search in blind. What I was thinking is that if we could create the abstracts with exactly the same way as the modified mw we could make a string comparison and test how many are different and how. Depending on the number and frequency of the text rendering templates that exist in the abstracts result we could try to resolve them manually. Removing the local Wikipedia mirror dependency for the extraction could be a huge plus but we shouldn't compromise on quality. Any other ideas? Best, Dimitris On Wed, Oct 3, 2012 at 9:41 AM, Pablo N. Mendes < >wrote: uI have searched a bit through the list and only found an example in Italian. *Article:* *Rendered text:* Vasco Rossi, anche noto come Vasco o con l'appellativo Il Blasco[7] (Zocca, 7 febbraio 1952), è un cantautore italiano. *Source:* {{Bio |Nome = Vasco |Cognome = Rossi |PostCognomeVirgola = anche noto come '''Vasco''' o con l'appellativo '''''Il Blasco''''' [ Vasco Rossi torna a giugno. Il Blasco piace sempre] archivio.lastampa.it |Sesso = M |LuogoNascita = Zocca |GiornoMeseNascita = 7 febbraio |AnnoNascita = 1952 |LuogoMorte = |GiornoMeseMorte = |AnnoMorte = |Attività = cantautore |Nazionalità = italiano }} If you could compare the output for both solutions with a few such pages, we could have an initial assessment of \"text quality\" as Dimitris put it. Cheers, Pablo On Wed, Oct 3, 2012 at 9:30 AM, Dimitris Kontokostas < >wrote: uI completely misunderstood what you were saying. I thought that you asked me for abstract generation quality feedback in general. Now I realized that you are referring to the fact that I generated abstracts without a local MediaWiki instance. What I did however may be different from what you suspect though. Here's what I did: - I saw that you invoke api.php of local MediaWiki instance to parse wiki text. I didn't bother to set it up so I just replaced the URL with actual Wikipedia instance of the language I worked on. This caused the wiki text to be rendered with templates substituted. 
- After this modification I parsed wiki text from the XML database dump using SimpleWikiParser and passed the PageNode to the getAbstractWikiText method in the modified AbstractExtractor - I saw that the returned text contains HTML markup so I removed it using an HTML sanitizer. I assumed that you use the \"modified\" MediaWiki to cover this part but I wasn't sure. - I was not happy with the short method in AbstractExtractor because it didn't recognize sentence boundaries correctly. I created my own shortening routine using java.text.BreakIterator with additional abbreviation checks. From what you're saying below I suspect that you are interested in generating abstracts without the need to invoke MediaWiki either locally or remotely. That I haven't tried to do. Sorry for the confusion but I'm very new to all this and I'm just trying to use some of the extraction framework code for my purposes. Are we on the same page now? Regards, Piotr On 2012-10-03 10:08, Pablo N. Mendes wrote: uI think you did exactly that with an unnecessary call to wikipedia. The PageNode is a parameter to AbstractExtractor.extract so you could call that directly. The patched mw is here: I was thinking we have 2 future tasks regarding this 1) Create an \"abstract\" mediawiki extension and get rid of the patched old mediawiki 2) See if a wikitext2text approach works (what you tried to do) You could use the shortening function from the mw code and then maybe contribute your code back ;-) Best, Dimitris PS. anyone else from the community that has some time to implement #1 is welcome On Wed, Oct 3, 2012 at 9:58 PM, Piotr Jagielski < >wrote: uI'm interested in exploring the wiki text to text approach. I'm just wondering what your idea is to do it without either a local MediaWiki or a remote Wikipedia call. Do you have any code in the extraction framework that can be used to parse wiki markup? I tried toPlainText() on PageNode but it appears only to replace links and it keeps all wiki formatting like bold, italics, lists etc. Regards, Piotr On 2012-10-04 08:16, Dimitris Kontokostas wrote: u+1 for a standalone WikiText to Text function. For example, being able to generate a plain text version of a page without a call to a local MediaWiki instance (or to Wikipedia itself) simplifies drastically the porting of the DBpedia extraction framework (aka DEF) onto distributed platforms such as Hadoop, because MediaWiki instances don't have to be shipped and set up on each node at each run. It also makes DEF lighter, and easier to install on a single box, whatever its size. Nicolas. On Oct 4, 2012, at 2:12 PM, Piotr Jagielski < > wrote:" "query timeout error." "uI am trying to run a query through a java program using the Jena ARQ API. Following is the query that I run through a java program. select distinct ?s ?o where { ?s ?o1 . ?o1 ?o . ?s ?o .} Sometimes this works and sometimes it doesn't. I do not know what the problem is. Even in the sparql endpoint, sometimes it doesn't work. The exception I get from the java program is HttpException: 500 SPARQL Request Failed. In the sparql endpoint it says timeout. I increased the timeout in the web interface but it doesn't help. I want to run this query for a list of properties and the code breaks when it executes the above query since it returns an exception. Do you have any clue what is going on?
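A hedged sketch of the kind of retry loop that comes up later in this thread, written so that a failed attempt actually pauses and retries with a growing backoff instead of rethrowing immediately. The endpoint constant and the example query are placeholders, and the org.apache.jena package names assume a recent Apache Jena (releases from this thread's era lived under com.hp.hpl.jena.query).

import org.apache.jena.query.*;

public class RetryingSparqlClient {

    static final String ENDPOINT = "http://dbpedia.org/sparql";

    // Runs a SELECT query, retrying up to maxAttempts times with exponential backoff.
    public static ResultSetRewindable selectWithRetry(String query, int maxAttempts) throws InterruptedException {
        long pauseMillis = 1000;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (QueryExecution qe = QueryExecutionFactory.sparqlService(ENDPOINT, query)) {
                // Copy the results so they remain usable after the execution is closed.
                return ResultSetFactory.copyResults(qe.execSelect());
            } catch (RuntimeException e) {   // Jena's HTTP failures are unchecked query exceptions
                System.err.println("attempt " + attempt + " failed: " + e.getMessage());
                if (attempt == maxAttempts) {
                    throw e;
                }
                Thread.sleep(pauseMillis);
                pauseMillis *= 2;
            }
        }
        throw new IllegalStateException("unreachable");
    }

    public static void main(String[] args) throws InterruptedException {
        String query = "SELECT DISTINCT ?type WHERE { <http://dbpedia.org/resource/Malaria> a ?type } LIMIT 20";
        ResultSet results = selectWithRetry(query, 5);
        while (results.hasNext()) {
            System.out.println(results.next().get("type"));
        }
    }
}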
uHI, Assuming you are running your query against the endpoint, could you retry your query and let me know if it now works. Patrick uYes. I am using the sparql end opoint of DBpedia. Now the code and the endpoint both works. What was the problem by the way? One other thing, how can I catch the exception for querying and issue the query again and again? I have a simple loop for that but when an exception is caught, it seems like the query is never attempted again and just keep throwing the exception. this is regarding Jena ARQ API. If anybody knows how to increase the timeout or keep the loop going for the purpose please help. Following is the simple loop I have (in java). Thank you. while (cond) { try { SparqlQuerying.selectQueryExecution(query); cond = false; } catch (Exception e) { System.out.println(\"Exception occurred 1 \" + e.getStackTrace()); e.printStackTrace(); Thread.sleep(1000); } } selectQueryExecution(query) - just executes the query with the sparql endpoint of DBpedia. When \"HttpException: 500 SPARQL Request Failed.\" is caught, it didn't seem like querying again but just throwing the same exception in quick succession. Kalpa Gunaratna uThe same exception what happened earlier continues now. I have the list of object properties to run queries against using a java program and for some queries it gives me timeout exception. One example is the same previous thread example. Another example query is attached here as follows. select distinct ?s ?o where { ?s < Gunaratna uHi Patrick, It was working last week but again not working for couple of days. I am using Kalpa Gunaratna uHi Kalpa Gunaratna, This appears to be a server side issue and we are currently looking into the cause. Please check as it should work at this time. Patrick uHi Patrick, It was working yesterday for the most of the object properties in DBpedia (except for 5 of them) and today it seems like working for very few object properties. A simple transitive restriction check fails for example on Both Dbpedia sparql endpoint in browser and also API calls using Jena ARQ fail to get any result. Is there any other alternative instead of this sparql endpoint that we can try? Thank you for your concerns. Kalpa Gunaratna" "How to best choose and use dbpprops for public use in radial tree layout app" "uHello, I have a question about using DBpedia properties. I'm working on an iPad app for visually browsing Wikipedia with radial tree layouts. My colleagues and I published a proof-of-concept app called 'WikiNodes': We want to improve the layout and content of the app using DBPedia data. For example, for a primary node \"Paris\", a child node could be \"Birthplace of\", and then we'd list ~10 of the people with dbpedia-owl:birthPlace. Where/how can we find some guidance on DBpedia properties, and whether to use them? uOn 8/29/12 11:22 AM, Michael Douma wrote: I am certainly interested in your enhancing this app via DBpedia. One critical item to note, do keep the DBpedia URIs accessible to humans and machines (via @hrefs that drives hyperlinks) this keeps the Linked Data Web flowing etcas folks share and annotate data. uOn Wed, Aug 29, 2012 at 11:22 AM, Michael Douma < > wrote: Is there more than one property used for this information? That sounds like a very low number of instances. By comparison, Freebase has 3583 people born there (which is still probably just a tiny fraction of those in Wikipedia). uHi Michael, Tom, Interesting product/project I added a few comments inline below. Cheers. 
By experience with similar projects/products at Yahoo! , you don't want to show all the possible related entities, but only the most important ones. Ranking them is important so the user is not overwhelmed. Also, you have limited space or displaying them. That's probably want was meant here with this selection of ~10 persons born in Paris. u@Nicolas, Yes, we are limiting sets. We have a radial layout, with limited screen real estate, so we have no intention to display 1000's of nodes. Just a dozen max per category should be enough for people to have something interesting to explore. We may use Wikipedia traffic volume to choose which subsets to display. @Tom: Thanks for your thoughts about comparing DBpedia and Freebase. Sorry our app is only iOS now. Yes, all browser-type apps (e.g., Wikipedia) are now 17+ rating. I'm all about free speech, but I'm actually supportive of parents/schools having a practical way to limit downloads that can display objectionable content. In practice I hope this is only enforced for young kids. Apple need policies which apply broadly, and it's not fair to hold Wikipedia to a different standard than any other broad-topic information database or browser. We actually have an additional profanity filter, but that did not help us get our 17+ rating lowered. @Kingsley: Our app will work offline, so we will need to create packages. We probably will not send users to DBpedia in realtime. Also, we are merging and massaging results. e.g., to filter from 1000's of hits to the top 10. @Anyone: If you'd like to see our WikiNodes app, and you have an iPad, send me a note off this list, and I can send you a promo code. This would be for the old/current Wikipedia app. Our new app is a bit different. Tom Morris wrote: uOn 8/29/12 11:53 PM, Michael Douma wrote: My comments aren't about the on or offline modality of your app. Its all about keeping the Web of Linked Data intact. For instance, when sharing insights, you should use DBpedia and Wikipedia URIs. Right now you only use Wikipedia URI. Most important of all, you tool is a nice annotation mechanism, and it would benefit immensely from using DBpedia URIs for fine-grained structured annotations that contribute by to the Web of Linked Data just by using triples (pick a format that works best for you) that leverage DBpedia URIs. uOn Thu, Aug 30, 2012 at 7:58 AM, Kingsley Idehen < > wrote: Isn't there a 1-to-1 mapping between Wikipedia & DBpedia URIs? That should allow one to easily map from a Wikipedia URI to a DBpedia URI. Tom uOn 8/30/12 9:14 AM, Tom Morris wrote: Tom, Yep! Thus, I would expect the app. to put this to good use :-) uOn 8/30/2012 6:19 AM, Kingsley Idehen wrote: And if you're interested in how the mapping works, here's the link: cheers, roberto uAt Yahoo, for filtering and ranking nodes and relationships in this type of knowledge graphs, we have been using co-occurrence in Web Search queries, Flickr tags (especially for Places), Web pages, and tweets, and various time frames for taking trends and buzziness into account. Overall, the best ranking factor is co-occurrence in Web Search queries. Wikimedia query logs could be of huge help here if you can get access to some. Otherwise, you can still use Wikipedia page view statistics to order related entities by popularity, and can even take trends into account. On a separate note, the DBpedia code is open source. Digging into it is useful for understanding DBpedia AND Wikipedia data Nicolas. PS: I have read about WikiNodes haven't tried it. 
Would be happy too :) On Aug 29, 2012, at 8:53 PM, Michael Douma < > wrote: uOn 8/30/2012 5:51 PM, Nicolas Torzec wrote: A couple of year ago we published some papers about how to \"rank\" pairs of DBpedia resources according to their relatedness. And for doing that we leverage on search engines (Google, Yahoo!, Bing), on a popular social tagging system (Delicious), on Wikipedia information and on the DBpedia graph structure. Here's a couple of papers about that: 1) Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Ranking the Linked Data: The Case of DBpedia. Web Engineering. Lecture Notes in Computer Science Volume 6189, 2010, pp 337-354 2) Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic Wonder Cloud: Exploratory Search in DBpedia. Current Trends in Web Engineering. Lecture Notes in Computer Science Volume 6385, 2010, pp 138-149 If anyone is interested I can send them the PDF. regards, roberto" "dbpedia-owl properties sparlq endpoint" "uHi all, I'm playing with the dbpedia sparql endpoint in order to extract the new dbpedia-owl:properties Nonetheless, it works fine with some resources but it doesn't work with others. So for example it works but it doesn't work Why? Am I doing something wrong? Thanks very much Davide Hi all, I'm playing with the dbpedia sparql endpoint in order to extract the new dbpedia-owl:properties Nonetheless, it works fine with some resources but it doesn't work with others. So for example Davide uHi Davide, the Wikipedia article infobox, and hence no (infobox)data for $p $o . the stuff you see at URI ?p ?o AND ?s ?p URI Hope that helps. Georgi uHi Davide Because there is none such properties for Theater: So I guess this is normal that you get an empty resutlset for that sparql query :) Take care, Fred" "DBpedia 2015-04" "uHi all! I’ve got couple of question regarding recent data set: 1. 1. Is this (2015-04) a final dump of DBpedia data? Currently there are no interlinks to other data sets ( useful for our project which performs mapping of customer vocabulary to LOD. 2. 2. Will future dumps be compressed by gzip instead bzip2 as well? 3. 3. Link to current version ( has been removed. Is there any chance it will be available again? This makes automated updates much more easier. Even though this happens once a year. Apologies, if asking in the wrong discussion! Thanks in advance, Mykhailo Hi all!  I’ve got couple of question regarding recent data set:  1.        1. Is this (2015-04) a final dump of DBpedia data? Currently there are no interlinks to other data sets ( customer vocabulary to LOD. 2.  2. Will future dumps be compressed by gzip instead bzip2 as well? 3.        3. Link to current version ( has been removed. Is there any chance it will be available again? This makes automated updates much more easier. Even though this happens once a year. Apologies, if asking in the wrong discussion! Thanks in advance, Mykhailo uHi Mykhailo, You got the right discussion. This is the to-be-announced next DBpedia release but I would consider it unstable/incomplete until it is actually announced regarding your questions, we will recompress the dumps in bz2 when ready, fill the links directory and update the 'current' link Cheers, Dimitris On Mon, Jun 15, 2015 at 2:33 PM, Mykhailo Drozdov < > wrote: uThanks Dimitris! I'll wait for official release! 2015-06-15 14:33 GMT+03:00 Mykhailo Drozdov < >: couple of question regarding recent data set:  1.        1. Is this (2015-04) a final dump of DBpedia data? 
Currently there are no interlinks to other data sets ( customer vocabulary to LOD. 2.  2. Will future dumps be compressed by gzip instead bzip2 as well? 3.        3. Link to current version ( has been removed. Is there any chance it will be available again? This makes automated updates much more easier. Even though this happens once a year. Apologies, if asking in the wrong discussion! Thanks in advance, Mykhailo uI've got one more question: will DBpedia 2015 have Freebase links or they are completely removed due to transition to Wikidata? Thanks, Mykhailo 2015-06-15 21:42 GMT+03:00 Mykhailo Drozdov < >: couple of question regarding recent data set:  1.        1. Is this (2015-04) a final dump of DBpedia data? Currently there are no interlinks to other data sets ( customer vocabulary to LOD. 2.  2. Will future dumps be compressed by gzip instead bzip2 as well? 3.        3. Link to current version ( has been removed. Is there any chance it will be available again? This makes automated updates much more easier. Even though this happens once a year. Apologies, if asking in the wrong discussion! Thanks in advance, Mykhailo uWe will still include Freebase links Cheers, DImitris On Tue, Jun 16, 2015 at 11:44 AM, Mykhailo Drozdov < > wrote:" "Finding related or similar entities in DBPedia" "uI'm thinking of creating some sort of relatedness graph of different DBPedia/Wikipedia pages. I can't find many published papers where people do this but I'm sure someone must have done work in this field before. I'm quite open in what I define as related. For example: \"Britney_Spears\" and \"Christina_Aguilera\" are related in that they both make similar kind of music. \"BMW\" and \"Mercedes-Benz\" are related since they are both car makers and companies from the same country and they have \"similar\" type of brands (\"BMW\" and \"Porche\" should ideally be less related than \"BMW\" and \"Mercedes-Benz\"). \"Microsoft\" and \"Google\" are both IT companies so hence they are related. \"London\" and \"England\" are related in that London is the capital of England. \"London\" and \"Paris\" might be related in that they are both major metropolitan cities (this one might be far fetched?). As you see, I'm kind of broad in what I define as \"related\" or \"similar\". I don't expect to be able to build or find a perfect system but I think the examples above give a hint of what I'm looking for. Basically I want a graph between nodes in DBPedia (Wikipedia). I be happy for any advice or suggestion regarding what papers to take a closer look at if you guys have either worked with this problem before or have stumbled upon good papers that are relevant to this topic. Thanks! uHello, Omid Rouhani schrieb: You may be interested in the DBpedia relationship finder, which finds paths in the RDF graph between two objects: Some information about it, can be found here: Of course, knowing the shortest paths between objects is still different from knowing how similar these objects are. If you are not looking for paths/graphs, but numbers describing similarity of resources/objects, then searching for (dis)similarity measures/metrics for RDF/OWL/Semantic Web will probably bring up a few results, e.g. this one: ftp://ftp.cs.wisc.edu/machine-learning/shavlik-group/ilp07wip/ilp07_damato.pdf Kind regards, Jens uHi Omid, the issue really is how to define similarity. Is IBM more similar to JPMorgan (both companies and located in New York), or to Microsoft (also both companies and in the same business area)? 
And in order to define this kind of similarity, we would need a DBpedia ontology, which we don't have yet. When we have this ontology, users could define that business area has a similarity weight of e.g. 0.7, whereas location only has 0.2. But again, that's hard at the moment without an ontology. But Jens already pointed to a good approach of him (DBpedia relationship finder) for measuring \"relatedness\". Cheers, Georgi uGeorgi Kobilarov wrote: Georgi, The whole UMBEL effort is about providing a loosely bound ontology that is loosely bound to DBpedia :-) I am hoping to have yourself and others involved with the extraction side of things, look to incorporate additional exploitation of UMBEL during the next DBpedia + 1 (i.e release after next) extraction round. As stated in by recent blog post about the loading of the Yago Class Hierarchy as Virtuoso Inference rules, this is part of a predetermined effort to address the new range of \"Context\" oriented issues that are popping up re. DBpedia (a natural effect of its popularity). We have the instance data out in the Linked Data Cloud, which makes loosely bound Data Dictionary the logical addition to the overall effort. Kingsley uHi Kingsley, that Of course I know UMBEL :) Well, at least I skimmed the blog posts / documentation. Is there an easy *technical* documentation of UMBEL? The project description is quite high-level and abstract, and as a lazy technical person I need simple technical examples However, when I previously mentioned the DBpedia Ontology, I wasn't talking about a concept ontology/taxonomy. I was talking about a (more or less) fixed vocabulary of properties used in DBpedia. So I look at it from an instance data point of view. Which properties are actually used in DBpedia, how can me make their descriptions explicit, clean them up and structure them? And this way, get rid of birthplace, birth_place, place_of_birth; placeofbirth, etc. Georgi uHi Georgi, Georgi Kobilarov wrote: Correct; that has been lacking. As we posted to the LOD mail list, advance documentation may be found at [1] (which may be modified slightly before final posting). We are nearly complete with a new update of OpenCyc (incorporating many of our findings in the initial UMBEL diligence and preparation), which has been a critical path item for a first formal release of UMBEL. Barring some last minute glitches, we will release UMBEL late this week, next week at the latest. Sorry for the delay, but I believe sufficient draft technical documentation is provided in the [2] link. Thanks, Mike [1] [2] uGeorgi Kobilarov wrote: Georgi, I understand the need for the technical docs and examples. These are coming etc> However, when I previously mentioned the DBpedia Ontology, I wasn't Okay, you've seen the first batch of changes that where made i.e .the rules loaded before the recent Yago Classes rules where added. Re. fixed vocabulary, how are you planning to approach this in a coordinated fashion with the rest of the DBpedia community (which is clearly growing? In a sense we current have teams that cover: 1. Extraction 2. Mapping (which includes URI minting) 3. Storage (which includes loading of instance data and inference rules ) 4. Deployment Soren & Jens (Leipzig) are the Dept. heads for 1. Chris, Richard, ad George are the Dept. heads for 2. OpenLink for 3. OpenLink & Richard 4. We don't have an Dept. that overseas the Fixed Vocabulary effort. The output of this Depts. effort is input for the Dept. 
handling item 3 (as you can see from the cleanup done with Birth related properties). Kingsley" "chembox?" "uHi everyone, first of all thanx for this great project! I was wondering if anyone picked about the chemistry in wikipedia and RDF-ied that for dbpediaBenzene [1], for example, is using 'chembox'. If not, how would I go about and add extraction of information from this template for use in dbpedia? Kind regards, Egon uHi Egon, we had a request on that topic from Christian Leger some days ago. I attach my original reply, which was off-list. Christian/Egon: Maybe you could collaborate? Cheers, Georgi uHi Georgi, On 7/31/07, Georgi Kobilarov < > wrote: I had scanned the archives, but most have overlooked this one. Sorry about that. Christian, I would appreciate if you could let me know if you have made progress. I probably will not work on it this week, but might have a look at the code next week. Egon uI created an SVG diagram of hydroxide which includes RDF in the SVG metadata tag: The metadata includes the latest chembox for that molecule. It parses nicely in the W3C's RDF validator I would like to know if this is a useful format for DBpedia chemistry extractions and if you suggestions for improvements. Be a better Heartthrob. Get better relationship answers from someone who knows. Yahoo! Answers - Check it out. ?link=list&sid;=396545433 uHi JJ, Some quick comments w.r.t. chembox extraction below. Note I don't know anything about chemistry. On 14 Oct 2007, at 00:36, JJ wrote: 1. The RDF describes an image, not a chemical compound. So it's OK to have title, creator and so on, but some of the other properties included in the RDF are troublesome. For example: pubchemCID=961 This is not an identifier for the image. It is an identifier for the chemical compound displayed in the image. I think the image and the compound should be separated into two resources. The image resource could then be related to the compound resource using a property such as dbpedia:skeletalImage (or whatever the proper scientific term is). 2. Regarding the dc:identifier triples, I think this is a bit of an abuse of the dc:identifier property, because the value \"pubchemCID=961\" is not just an identifier, but a structured key/ value pair, where the key is the actual identifier. I think it would be more useful to do this: 961 This allows us to SPARQL directly for a certain identifier, without need to do further string processing. 3. A lot of interesting data is inside the CML block, which is an XML format but not RDF. Thus the data inside the CML cannot be SPARQLed or otherwise processed by RDF tools. Is there an RDF mapping for CML, or would it be feasible to create a partial mapping for the purpose of chembox extraction? If not, then I think you could at least duplicate interesting data such as the molecular mass into an RDF property. Best, Richard uHi JJ, Some quick comments w.r.t. chembox extraction below. Note I don't know anything about chemistry. On 14 Oct 2007, at 00:36, JJ wrote: 1. The RDF describes an image, not a chemical compound. So it's OK to have title, creator and so on, but some of the other properties included in the RDF are troublesome. For example: pubchemCID=961 This is not an identifier for the image. It is an identifier for the chemical compound displayed in the image. I think the image and the compound should be separated into two resources. 
The image resource could then be related to the compound resource using a property such as dbpedia:skeletalImage (or whatever the proper scientific term is). 2. Regarding the dc:identifier triples, I think this is a bit of an abuse of the dc:identifier property, because the value \"pubchemCID=961\" is not just an identifier, but a structured key/ value pair, where the key is the actual identifier. I think it would be more useful to do this: 961 This allows us to SPARQL directly for a certain identifier, without need to do further string processing. 3. A lot of interesting data is inside the CML block, which is an XML format but not RDF. Thus the data inside the CML cannot be SPARQLed or otherwise processed by RDF tools. Is there an RDF mapping for CML, or would it be feasible to create a partial mapping for the purpose of chembox extraction? If not, then I think you could at least duplicate interesting data such as the molecular mass into an RDF property. Best, Richard uInstead of the dc:identifier, what if I use 222 InChI=1/H3N/h1H3 azane will that be better to make the metadata usable? The idea is to use metadata \"within\" an image document, not separating the resources. This can already be done for digital images (Adobe XMP for example), but the metadata is not human-readable as well. But, moreover, I think it is useful to try to embed metadata in any resource (i.e HTML, SVG, png). I'll see what can be done to add RDF-CML elements. uHi JJ/Richard, On 10/21/07, JJ < > wrote: Yes, that would be more useful. I use this approach on rdf.openmolecules.net, which I am slowly converting to use the RDF classes defined in: Would be nice if we could synchronize things. Egon" "Error using import.sh for the abstract Extractor" "uHello, I have been trying for the past few days to import a recent dump (enwiki-20120902-pages-articles.xml) into mysql (using the README.txt However, after sucessfully importing 8,800,000 pages, an array out of bound exception occurs : 8 800 000 pages (66,855/sec), 8 800 000 revs (66,855/sec) [WARNING] java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297) at java.lang.Thread.run(Thread.java:722) Caused by: java.lang.ArrayIndexOutOfBoundsException: 2048 at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source) at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source) at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) at javax.xml.parsers.SAXParser.parse(SAXParser.java:392) at javax.xml.parsers.SAXParser.parse(SAXParser.java:195) at 
org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88) at org.mediawiki.dumper.Dumper.main(Dumper.java:143) 6 more [INFO]" "Lua scripting on Wikipedia" "uHi, I saw few days ago that MediaWiki since one month allow to create infoboxes (or part of them) with Lua scripting language. So my question is, if every data in the wikipedia infoboxes are in Lua scripts, DBPedia will still be able to retrieve all the data as usual ? My other question is mainly concerned by Wikipedia FR, because I don't found the same thing in english, sorry. Since almost one year for the infobox population property we can do something like that : population = {{population number}} Where \"population number\" refer to a number which is on another page. Let me give you an example, the Wikipedia page about Toulouse city, contain this infobox property : | population = {{Dernière population commune de France}} And the value of \"Dernière population commune de France\" is contained in this wikipedia page : So now the problem is that in the xml dump we don't have the real value of the population so it exist a way to have the value and not the \"string\" which represent the value ? I hope that I was enough clear, otherwise don't hesitate to ask me some informations in more about these problems. Thanks for your lights. Best regards. Julien Plu. Hi, I saw few days ago that MediaWiki since one month allow to create infoboxes (or part of them) with Lua scripting language. Plu. uHi Julien On Fri, Apr 5, 2013 at 11:44 AM, Julien Plu < > wrote: I think that there is some time untill all infoboxes start using Lua. Even if they do we still don't know if it will be just for rendering or if it will affect the actual data. This is very fresh so we'll have to wait a while and see how it goes Tthe problem that you are describing is known but we were not aware that it existed in infoboxes too. We faced this when we needed to generate abstracts so we used a working mediawiki clone of wikipedia to render them. Your case is more complicated because we can't know in advance what needs to be rendered or not. You should also talk with Julien Cojan, he is responsible the DBpedia in french Chapter (fr.dbpedia.org). I guess he already faced similar issues. Best, Dimitris uThanks a lot Dimitri for you answer. I will talk about that with Julien Cojan. Best. Julien Plu. 2013/4/5 Dimitris Kontokostas < > uHi Julien, thanks for the heads-up! On 5 April 2013 10:44, Julien Plu < > wrote: I'm not 100% sure, and we should look into this, but I think that Lua is only used in template definitions, not in template calls or other places in content pages. DBpedia does not parse template definitions, only content pages. The content pages probably will only change in minor ways, if at all. For example, {{Foo}} might change to {{#invoke:Foo}}. But that's just my preliminary understanding after looking through a few tuorial pages. I've seen similar structures on Wikipedia de [1] and I think also on pl or cs: the actual data is not in the content pages, but in some template, and is rendered on the content page by rather complex mechanisms. To deal with this, DBpedia could try to expand templates, or maybe just certain templates (we don't want all the HTML stuff). Great generality, but may cause perfomance and other problems. In the worst case, mapping-based extraction could become as slow as abstract extraction. 
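To make the template-expansion option concrete, here is a minimal sketch that asks a running MediaWiki to expand a wikitext fragment via its standard expandtemplates API module. The endpoint URL, User-Agent string and sample template call below are placeholders rather than anything from the extraction framework, and the one HTTP round-trip per fragment is exactly the performance cost mentioned above.

import json
import urllib.parse
import urllib.request

API = 'https://en.wikipedia.org/w/api.php'  # could equally be a local MediaWiki mirror

def expand_wikitext(wikitext, title='Sandbox'):
    # Ask MediaWiki to expand all template calls in the given fragment.
    params = {
        'action': 'expandtemplates',
        'format': 'json',
        'prop': 'wikitext',
        'title': title,
        'text': wikitext,
    }
    request = urllib.request.Request(
        API + '?' + urllib.parse.urlencode(params),
        headers={'User-Agent': 'dbpedia-extraction-sketch/0.1 (placeholder)'})
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data['expandtemplates']['wikitext']

print(expand_wikitext('{{convert|3.2|in|mm|abbr=on}}'))

Restricting such calls to a whitelist of known data templates, as suggested, would keep the number of round-trips manageable.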
Or we could let people add rules on the mappings wiki about which templates contain data and how the data should be attached to certain DBpedia resources. Of course, determining syntax and semantics for such rules wouldn't be trivial but if we get there, we could implement the data extraction as a preprocessing step: in a first extraction phase, go through the Wikipedia dump, collect and store stuff from these 'data templates', and during the main extraction, pull the data from the store where needed and generate triples. Informally, we already have such a preprocessing phase for the redirects. It would make sense to \"formalize\" it and also use it for other info, e.g. disambiguation pages, inter-language links, resource types, etc. Cheers, JC [1] For example, {{Infobox Gemeinde in Deutschland | Gemeindeschlüssel = 03241001 }} (\"Gemeinde in Deutschland\" means \"community in Germany\", \"Gemeindeschlüssel\" means \"community key\".) The actual data is in pages like uHi all, this is a relevant topic to be address soonish. There is already work to replace the Template:Infobox with a Lua script. Also see the blog post here: Also the Wikidata inclusion syntax is starting to replace values by calls to the repository, see: This will make it increasingly harder to retrieve properties along with their values from Wikipedia dumps without a) interpreting the Lua script results and b) accessing Wikidata. Cheers, Anja On Apr 5, 2013, at 15:40, Jona Christopher Sahnwaldt < > wrote: u@Jona Christopher Sahnwaldt : Lua scripting has been made to create new templates, and these templates will able to be used in the infoboxes. By the way, happy to see that some solutions are in thinking to solve the problem about wikidata at least. I hope that efficients solutions will be found :-) Best. Julien Plu. 2013/4/5 Anja Jentzsch < > uHi, from what I understood the problem which will arise for DBpedia with the introduction of Wikidata, is that actual values will not be available in Wikipedia dumps anymore. Instead, we will end up either finding nothing (see InterWiki links) or finding wikidata parser functions, e.g. {{#property:p169}} as shown here [1] Maybe one approach could be to gather wikidata dumps and build some sort of triples stores, which can be used to resolve actual data during extraction (adding handling of specific parser functions in the extraction framework). WDYT? Cheers Andrea [1] 2013/4/5 Julien Plu < > uHi, Wikidata will provide RDF dumps as well as XML dumps btwThere might also be a third party SPARQL endpoint serving them if we find hosting support. Cheers, Anja On Apr 5, 2013, at 18:12, Andrea Di Menna < > wrote: uOn Fri, Apr 5, 2013 at 9:40 AM, Jona Christopher Sahnwaldt < >wrote: As far as I can see, the template calls are unchanged for all the templates which makes sense when you consider that some of the templates that they've upgraded to use Lua like Template:Coord are used on almost a million pages. Here are the ones which have been updated so far: Performance improvement looks impressive: Tom On Fri, Apr 5, 2013 at 9:40 AM, Jona Christopher Sahnwaldt < > wrote: thanks for the heads-up! On 5 April 2013 10:44, Julien Plu < > wrote: > Hi, > > I saw few days ago that MediaWiki since one month allow to create infoboxes > (or part of them) with Lua scripting language. > Tom uHi, @Anja : Have you a post from a blog or something like that which speaking about RDF dump of wikidata ? The french wikidata will also provide their data in RDF ? This news interest me very highly. Best Julien Plu. 
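The preprocessing idea in the last few messages (build a local store from a Wikidata dump, then resolve {{#property:}} calls during extraction) can be sketched in a few lines. This is only an illustration of the proposal, under the assumption that a lookup from (item id, property id) to a plain value has already been built from such a dump; the function names, the JSON layout and the toy values are invented for the example.

import json
import re

PROPERTY_CALL = re.compile(r'\{\{#property:(p\d+)\}\}', re.IGNORECASE)

def load_wikidata_values(path):
    # Build a lookup (item id, property id) -> value from some local export;
    # the JSON layout expected here is an assumption, not a real dump format.
    with open(path, encoding='utf-8') as f:
        return {(row['item'], row['property'].lower()): row['value']
                for row in json.load(f)}

def resolve_property_calls(wikitext, item_id, values):
    # Replace {{#property:pNN}} with the locally stored value, or leave the
    # call untouched if the store has nothing for this item/property pair.
    def substitute(match):
        return values.get((item_id, match.group(1).lower()), match.group(0))
    return PROPERTY_CALL.sub(substitute, wikitext)

values = {('Q1', 'p45'): 'Example Name'}   # toy data; in practice: load_wikidata_values(...)
page = '{{Infobox:Test | name = {{#property:p45}} }}'
print(resolve_property_calls(page, 'Q1', values))

After this substitution the page can go through the normal extractors unchanged, which is what makes it attractive as a preprocessing step.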
2013/4/5 Tom Morris < > uOn 5 April 2013 19:59, Julien Plu < > wrote: @Anja: do you know when RDF dumps are planned to be available? There is only one Wikidata - neither English nor French nor any other language. It's just data. There are labels in different languages, but the data itself is language-agnostic. uOk, thanks for the precision :-) It's perfect, now just waiting when the dump of these data will be available. Best. Julien Plu. 2013/4/5 Jona Christopher Sahnwaldt < > uHi, For me there is no reason to complicate the DBpedia framework by resolving Wikidata data / templates. What we could do is (try to) provide a semantic mirror of Wikidata in i.e. data.dbpedia.org. We should simplify it by mapping the data to the DBpedia ontology and then use it like any other language edition we have (e.g. nl.dbpedia.org). In dbpedia.org we already aggregate data from other language editions. For now it is mostly labels & abstracts but we can also fuse Wikidata data. This way, whatever is missing from the Wikipedia dumps will be filled in the end by the Wikidata dumps Best, Dimitris On Fri, Apr 5, 2013 at 9:49 PM, Julien Plu < > wrote: uYes Dimitri, but for doing a semantic mirror of Wikidata we need the dump of these data, and then create an extraction and mapping method for creating it. No ? Because for me (don't hesitate to tell if I'm wrong) all what we need is the dump of wikidata, whatever the format but just this. Like that it will be easy (?) to make a mapping between these data and the template include in the infoboxes. Best. Julien. 2013/4/5 Dimitris Kontokostas < > uHi Dimitris, I am not completely getting your point. How would you handle the following example? (supposing the following will be possible with Wikipedia/Wikidata) Suppose you have {{Infobox:Test | name = {{#property:p45}} }} and a mapping {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name}} what would happen when running the MappingExtractor? Which RDF triples would be generated? Cheers Andrea 2013/4/5 Dimitris Kontokostas < > uHi all, the according code to generate RDF for items is still in review but hopefully live soon. In the meantime we are working on the maintenance script for generating RDF dumps and they will be generated along with the other dumps then. Cheers, Anja On Apr 5, 2013, at 20:06, Jona Christopher Sahnwaldt < > wrote:" "Where is that "raw" wikipedia infobox dataset?? and" "uWhat I actually want to query: -Android devices with a display of at least 800x480 Out of obvious reasons I reduced that to: -list all android devices with display information included After a couple of days I came up with this query: SELECT DISTINCT ?subject, ?display { { ?subject . } UNION { ?subject a . } OPTIONAL { ?subject ?display } } My problem: Where is that “raw” infobox dataset, which promises “complete coverage of all Wikipedia properties” with minimal clean-up? The downloadable infobox_properties file and the endpoint return only crap for the display property like: “4” or empty values! (Try the query yourself!) The live.dbpedia.org/sparql endpoint returns more, but still useless. I’m aware of the missing ontology mappings for the mobile phone infoboxes ( Should dbpedia live not import the raw values when there are no mappings? The wikipedia template uses micro-templates like {{convert|2.1|in|mm|abbr=on}}, how does dbpedia handle that? How does IntermediateNodeMapping separate the property string?? By spaces alone? Then how to handle this? 
| display = [[TFT LCD]], {{convert|3.2|in|mm|abbr=on}} diagonal 320×480 px HVGA 1.5:1 aspect-ratio wide-screen 256K colors As far as I understand, CustomMappings are not implemented via media-wiki, is that right? – Would be nice to have some kind of RegexMapping, with: 1) a regular expression retrieves one or more values (named groups) 2) multiple regular expressions can be given 3) values retrieved can be subject to some mathematical/conditional cleanup (e.g. if first_var < second_var then “short_side” = first_var; “orientation” = portrait) 3b) and some more examples: if xyGA = HVGA then “short_side” = 320 3c) and maybe some extra calulations: “dpi” = sqrt(+)/ So, how do I get that display info out of dbpedia at all? And how to improve the situation for easy retrieval of both display dimensions? Thanks, Robert uOn 11/9/11 8:50 PM, Robert Siemer wrote: uOn 10/11/11 10:34, Kingsley Idehen wrote: uOn 11/10/11 12:29 AM, Robert Siemer wrote: uRobert, My guess is that it is a problem with parsing templates when they are in property values, as you already seem to have found out. About your initial question: Where is that \"raw\" wikipedia infobox dataset?? I imagine that people decided to abort outputting templates within the values as they would look \"broken\" to the naked eye. I can already see what kinds of questions would show up in the list saying that the extraction is broken, when it would actually just be a raw representation of the content in Wikipedia. Would be nice to have some kind of RegexMapping Yes, I think that is a great idea! Unless one of the core developers steps up to say that this is a bad idea, or that it won't work for some reason, I would encourage you to give it a try and share your results with the list. I think you've already gotten a hold of this, but just in case, all of the code is available from here: Cheers, Pablo On Thu, Nov 10, 2011 at 2:50 AM, Robert Siemer < > wrote:" "Special characters in IRI's > which encoding?" "uHi, I noticed some of the special characters in IRI's are not available in UTF8. For instance this one: becomes in UTF8: Which encoding scheme should I use in my code to get rid of these question marks? Thanks, Roland uExcuse me, this was a typical layer 8 problem; I should have specified UTF-8 when writing to file from a box configured for ISO-8859-1 Roland On 10-09-14 20:36, Roland Cornelissen wrote: uHi How can i unsubscribe from this mail list, i no longer use Dbpedia Thanks Regard On Wed, Sep 10, 2014 at 10:01 PM, Roland Cornelissen < > wrote:" "Take 2: How To Do with deal with the Subjective Matter of Data Quality?" "uAll, Clearer Subject Heading. Increasingly, the issue of data quality pops up as an impediment to Linked Data value proposition comprehension and eventual exploitation. The same issue even appears to emerge in conversations that relate to \"sense making\" endeavors that benefit from things such as OWL reasoning e.g., when resolving the multiple Identifiers with a common Referent via owl:sameAs or exploitation of fuzzy rules based on InverseFunctionProperty relations. Personally, I subscribe to the doctrine that \"data quality\" is like \"beauty\" it lies strictly in the eyes of the beholder i.e., a function of said beholders \"context lenses\". I am posting primarily to open up a discussion thread for this important topic. uOn 4/8/11 2:25 PM, Frank Manola wrote: Yes, and go one step further by making the *objectivity* part of the data. 
Basically, discussion/conversation/debates about the data should be part of the zeitgeist of any Data Object or collection of Data Objects. Yep! As stated above. Agreeing to Disagree is one of the most powerful aspects of the Web and the emerging Web of Linked Data :-)" "Error with the PHP mediawiki" "uHi, I installed mediawiki on my web server it worked like a charm, the version that I use is origin/REL1.21. Now after to import every templates I have to download your \"LocalSettings.php\", \"ApiParse.php\" and \"DBPediaFunctions.php\" files. I put your \"LocalSettings.php\" instead of mine, I change just the database user password. I replace the \"ApiParse.php\" file by your version too and put the \"DBPediaFunctions.php\" file in the root directory of mediawiki. After I download every extensions listed in the \"LocalSettings.php\", by the way \"Timeline\" extension doesn't exist anymore, now it's \"TimelineTable\" so I downloaded this one and modified the \"LocalSettings.php\" file to make replacement. And when I open this page \" have a blank page and this error in the apache log file : [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP Fatal error: Call to a member function disable() on a non-object in /var/www/mediawiki/includes/GlobalFunctions.php on line 2164 [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP Stack trace: [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP 1. MWExceptionHandler::handle() /var/www/mediawiki/includes/Exception.php:0 [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP 2. MWExceptionHandler::report() /var/www/mediawiki/includes/Exception.php:713 [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP 3. MWException->report() /var/www/mediawiki/includes/Exception.php:643 [Tue Apr 16 17:50:14 2013] [error] [client 127.0.0.1] PHP 4. wfHttpError() /var/www/mediawiki/includes/Exception.php:265 I precise that my database is called \"frwiki\". Anyone can know why this error occur ? Thanks. Best. Julien. Hi, I installed mediawiki on my web server it worked like a charm, the version that I use is origin/REL1.21. Now after to import every templates I have to download your 'LocalSettings.php', 'ApiParse.php' and 'DBPediaFunctions.php' files. I put your 'LocalSettings.php' instead of mine, I change just the database user password. I replace the 'ApiParse.php' file by your version too and put the 'DBPediaFunctions.php' file in the root directory of mediawiki. After I download every extensions listed in the 'LocalSettings.php', by the way 'Timeline' extension doesn't exist anymore, now it's 'TimelineTable' so I downloaded this one and modified the 'LocalSettings.php' file to make replacement. And when I open this page ' Julien. uHi Julien, you can take a look at this thread. Gaurav (cc) had the exact same issue. if the thread doesn't help, maybe Gaurav could give you some hints. Cheers, Dimitris On Tue, Apr 16, 2013 at 7:05 PM, Julien Plu < > wrote: uThanks Dimitris, I will watch this tomorrow because I'm not behind my working machine until tomorrow morning. So I will let you know :-) Best. Julien. 2013/4/16 Dimitris Kontokostas < > uOk, the problem come from the \"LocalSettings.php\" file coming from the repository. I don't know where but it come from here. So I think instead of providing the file, maybe explain just what we have to change inside will be a better solution ? Best. Julien. 2013/4/16 Julien Plu < > uYou are correct, we shouldn't tell people to use the whole file. 
Maybe we can put most of our changes / additions into one php file DBpediaSettings.php, put that file into the repo, and tell people to add that file to their MediaWiki folder and add something like include \"DBpediaSettings.php\" at a certain place in their LocalSettings.php. I'm not sure if that will work, I don't know enough about PHP in general and our changes in particular, but could you give it a try? Or maybe we need multiple DBpedia settings files because we have to modify different parts of LocalSettings.php. The last message in the thread Dimitris pointed to explains a different, somewhat like less elegant approach: Delete LocalSettings.php. Go to appropriate URL is) in your browser. You should see a page asking for your configuration. Enter your settings. When you are done, you should have a new LocalSettings.php file. Now comes the hard part. Figure out what we changed in our LocalSettings.php. In other words, make a git diff between the first and last version of . Now, also make these changes in your own LocalSettings.php - that means, copy a few lines here and there. To test if things are working, go to see usage instructions. Now try the extraction again. I keep my fingers crossed. :-) On Apr 17, 2013 12:09 PM, \"Julien Plu\" < > wrote: uI did almost exactly the same thing. And I don't know enough too about PHP but we have 2 things to change inside, it's really easy to ask to the users to change just these changements. By the way, I have a simple question with no meaning with this topic, but where are put the results N-Triples files ? Best Julien. 2013/4/17 Jona Christopher Sahnwaldt < > uThe triples files are created in the same folders where the Wikipedia xml dump files are stored. On Apr 17, 2013 1:47 PM, \"Julien Plu\" < > wrote: uAh yes, thanks :-) I think that I will begin to review the guide for other things than MySQL part. Best. Julien. 2013/4/17 Jona Christopher Sahnwaldt < >" "Importation process error" "uHi, When I run this command \"/clean-install-run import\" an error occur : [INFO] uWell, this file does not exist (or is not readable): java.io.FileNotFoundException: /var/www/mediawiki/core/maintenance/tables.sql I think you should check out a clean, current mediawiki (version 1.22wmf1 or so) from their git repo. It's documented on JC On Apr 16, 2013 2:42 PM, \"Julien Plu\" < > wrote: uMy bad, it was from a bad configuration in the pom.xml. All my apologies for the disturbing. Best. Julien. 2013/4/16 Jona Christopher Sahnwaldt < >" "1st Challenge on Question Answering over Linked Data (QALD-1)" "uThe challenge on Question Answering over Linked Data is on!. Please find below important information. We would be happy if you participate. [Apologies for cross-posting] 1st Challenge on Question Answering over Linked Data (QALD-1) collocated with the corresponding workshop at the Extended Semantic Web Conference (ESWC) * Motivation * While more and more semantic data is published on the Web, in particular following the Linked Data principles, the question of how typical Web users can access this body of knowledge through an intuitive and easy-to-use interface that hides the complexity of the Semantic Web standards becomes of crucial importance. Since users prefer to express their information need in natural language, one of the main challenges lies in translating the user's information needs into a form such that they can be automatically processed using standard Semantic Web query processing and inferencing techniques. 
In recent years, there have been important advances in semantic search and question answering over RDF data, and in parallel there has been substantial progress on question answering from textual data as well as in the area of natural language interfaces to databases. This shared task and the associated workshop aim at bringing together researchers from these communities that accept the challenge of scaling question answering approaches to the growing amount of heterogeneous and distributed Linked Data. Our long-term goal is to understand how we can develop QA approaches that deal with the fact that i) the amount of RDF data available on the Web is huge, ii) that this data is distributed and iii) that it is heterogeneous with respect to the vocabularies or schemas used. * Shared Task * Question answering systems of all kinds are invited to participate in the shared task of processing natural language queries and retrieving relevant answers from a given RDF dataset, thereby providing an in-depth view of the strength, capabilities and shortcomings of existing systems. Please note that although some of the questions are quite complex, nonetheless we would like to encourage everybody to participate in the challenge even if they can only successfully process a subset of the questions. Although the competition is tailored towards question answering systems based on natural language, we strongly encourage other relevant systems and methods that can benefit from the evaluation datasets to also report their results. * Datasets * We provide two datasets: DBpedia 3.6 and MusicBrainz. They can either be downloaded or accessed via a SPARQL endpoint. In addition, we provide 50 training questions for each dataset, annotated with corresponding SPARQL queries and answers. Later, during the test phase, participating systems will be evaluated with respect to precision and recall on a set of 50 similar questions. * Submission * Submission of results will be possible via an online form, available at the following site (from February 7th on). Submission of results on training data will be allowed at any time. Participants will receive the results on the training data for every submission. Submission of results on test data will be possible starting from April 1st and close on April 10th. Participants can submit as many runs as they want but will not receive any feedback. * Schedule * Release of QA training dataset and instructions: Feb 3 Release of QA testset: March 28 Submission of results by participants: April 1 Close of result submission on test data: April 10 * More information * For detailed information as well as links to the datasets and training questions, please check the workshop website: If you want to be regularly updated with information about the task, there is a QALD-1 mailing list, to which you can subscribe at the following location: All the best!" "Quick Tweak to DBpedia Real-time Edition" "uAll, When I read the notice from Sebastian earlier today, re. the new DBpedia & Wikipedia real-time variant, it dawned on me to use this new DBpedia instance to demonstrate a fundamental feature of DBpedia deployment. Basically, what I've referred to in prior commentary as a \"subtle nuance\" added to the Linked Data deployment mix. Sequence: 1. Our DBpedia partners at University of Leipzig put out an initial bet cut of DBpedia with real-time links to Wikipedia 2. Their demo links are to the SPARQL endpoint which produces a basic Web results page without any live links (de-referencable URIs) 3. 
I ping Sebastian and indicate to him that by simply installing the DBpedia VAD package (Virtuoso's equivalent of an RPM) you will get the same Linked Data deployment used by the live DBpedia instance, our EC2 AMIs, and the DBpedia on Virtuoso 6.0 instance that we are currently hot staging. Post the above (about 3 mins or less for me to install the VAD from Burlington, MA into the instance in Leipzig, Germany), we now have: 1. 2. An owl:sameAs link to 3. Local URI dereferencing on the Leipzig Virtuoso instance (irrespective of the actual URIs in the Quad Store, think of this as outbound rewrite rules to complement inbound rewrite rules via SPARQL). To conclude, this is a simple demonstration of how to address the problem of Linked Data Set propagation en route to Linked Data Web resilience i.e., a Linked Data Web where URIs may come and go, but the actual data (URI / Pointer referents) persist. The more replicas the more resilient the Linked Data Web becomes." "Next DBpedia release ?" "uHi, Just wondering if there was a schedule/roadmap for the next release(s) of the DBpedia dataset ? It's coming up to half a year since the last one was released and it would be nice to pull in some of the updated data from Wikipedia at some point. If there isn't going to be one soon, then does anyone have any stats on how long it takes to run the dbpedia extraction scripts (1-2 days) ? As I'd like to update my local copy of the data. Thanks, Rob uHi Rob, the issue was the availability of Wikipedia dumps. Dumps in April and Mai failed. A new working Wikipedia dump was released one week ago. The extraction is currently running on servers in Leipzig. @Jens/Sören: Do you have time to start a new extraction? Rob, the extraction should run approx. 2 days. And I would love to get some feedback if the extraction framework works for you :) Cheers, Georgi uUups, a little correction: uHello, robl schrieb: We'll probably start another extraction at the end of June. Christian Becker wants to update the GeoExtractor soon, the Yago mapping should be updated, and hopefully a few bugs will be fixed. Depends on your machine and on whether you've imported the Wikipedia dumps already. If everything runs smoothly, you need about 10 days on an average computer to import the dumps and extract the data sets. Of course, you can choose to extract only the data sets and language versions you need to reduce the runtime of the extraction script. Kind regards, Jens" "Mappings for template redirection explanation" "uHi, i am a little confused on how mappings deal with template redirection. e.g. when template A redirects to template B what happens to articles using template A or B a) if we define mapping only for A b) if we define mapping only for B Thanks! Cheers, Dimitris Hi, i am a little confused on how mappings deal with template redirection. e.g. when template A redirects to template B what happens to articles using template A or B a) if we define mapping only for A b) if we define mapping only for B Thanks! Cheers, Dimitris uOn Fri, Mar 18, 2011 at 08:58, Dimitris Kontokostas < > wrote: As far as I know, the procedure is the following: A" "Area code madness in DBPedia" "uHi all, At the VU University Amsterdam, we're teaching various courses on the Semantic Web in which (understandably) DBPedia plays an important role. Over the past year's we have carefully groomed and curated a collection of example SPARQL queries against DBPedia. These queries revolve around cities with the \"020\" area code. 
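A representative query of the kind described (an illustration only; the course's actual queries are not reproduced in this thread) could be sent to the public endpoint roughly like this, matching places whose curated area code is the string 020:

import json
import urllib.parse
import urllib.request

QUERY = '''
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?place WHERE {
  ?place dbpedia-owl:areaCode '020' .
} LIMIT 10
'''

url = 'http://dbpedia.org/sparql?' + urllib.parse.urlencode(
    {'query': QUERY, 'format': 'application/sparql-results+json'})
with urllib.request.urlopen(url) as response:
    for binding in json.load(response)['results']['bindings']:
        print(binding['place']['value'])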
To my surprise, these queries suddenly stopped working. Apparently because there has been a new release of DBPedia in which this was changed. howeverit turns out that they work sometimes but do not work at other timesmadness! ;-) A bit of background: The examples use the property (i.e. a curated property), and assume the area codes are represented as string literals, e.g. \"020\", sometimes with an erroneous language tag. It now seems that in some versions of the DBPedia endpoint, the property < data: there is no place name with a dbpedia-owl:areaCode property. Even though the areaCode property is still defined in the ontology (at least, that's what I see when I point my browser at it). In addition to that (and much much worse), in some versions of the DBPedia endpoint, area codes are represented as *integers*, but are syntactically still presented as the value they had as strings. E.g. the area code for Amsterdam is represented as 020 rather than \"020\". Needless to say that 020 as an integer is equal to 20hmthat's not what we want Area codes simply are not integers, they are strings, because the actual digits matter. (also, application developers should be able to rely on relative stability of statements that use the dbpedia ontology namespace, also, the ontology should only change monotonously: only add classes, properties etc., don't remove them!) In short: A couple of months ago, the area codes were represented as language tagged strings using the dbpedia-owl:areaCode property Yesterday morning at approx. 10 am, the area codes were represented as integers using the dbpedia-prop:areaCode property Yesterday evening, at approx. 10 pm, the area codes were represented as language tagged strings using the dbpedia-owl:areaCode property Today, at approx. 1 pm, the area codes were represented as integers using the dbpedia-prop:areaCode property It would be exceedingly nice if this could be fixed. Perhaps it has something to do with load balancing and different stores not being synchronized properly? -Rinke PS area codes may be stringsbut they don't have to be language-tagged ;-) Hi all, At the VU University Amsterdam, we're teaching various courses on the Semantic Web in which (understandably) DBPedia plays an important role. Over the past year's we have carefully groomed and curated a collection of example SPARQL queries against DBPedia. These queries revolve around cities with the '020' area code. To my surprise, these queries suddenly stopped working. Apparently because there has been a new release of DBPedia in which this was changedhoweverit turns out that they work sometimes but do not work at other timesmadness! ;-) A bit of background: The examples use the < u0€ *†H†÷  €0€10  `†He uSure, Here's one for the area-codes-as-integers: and here's the original query for area-codes-as-strings: Thanks, Rinke On Tue, Sep 8, 2015 at 2:43 PM Kingsley Idehen < > wrote: u0€ *†H†÷  €0€10  `†He uHi Kingsley, Thanks for your help. I am aware that the two queries use a different predicate; that's part of the problem. The issue is that depending on the time at which I try the queries, one query will return results, and the other won't (or the other way around). 
-Rinke On Tue, Sep 8, 2015 at 4:58 PM Kingsley Idehen < > wrote: uHi Rinke & thanks for your report, besides Kingsley's comments, some notes from my side On Tue, Sep 8, 2015 at 2:19 PM, Rinke Hoekstra < > wrote: Since the 2014 release (09/2014) we removed language tags from all xsd:strings and introduced rdf:langString according to RDF 1.1, so dbpedia-owl:areaCode (not dbp:langCode) has had no language tags in our dumps for 1+ year. dbp:areaCode comes from the infobox extractor that provides raw (and many times inaccurate) data. The datatype of dbp:areaCode is decided for every triple instance, based on the data we get, by a greedy algorithm. Nothing is denoted; we use the mappings wiki to map infobox templates to RDF. I can imagine that a template in Wikipedia possibly changed and the mapping was not adjusted accordingly. If you can provide example resources we can identify the source of the error more easily and fix it. As mentioned before, dbp:areaCode is not to be trusted, provided only for completeness. dbpedia-owl:areaCode should be consistent at all times; if you find a counter example please report it. In general we agree, but sometimes you need to break some eggs to be able to move forward and provide more consistent data. However, I don't think we made any breaking changes to dbpedia-owl:areaCode; this should be before August 2014. If you found it afterwards it is probably a bug. See above comments regarding dbp. (Again) This should be a bug if true; see above comments regarding dbp
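Given the representational drift discussed here (plain strings, language-tagged strings, occasionally integers), one defensive option for such queries, offered as a sketch rather than anything proposed in the thread, is to match on the lexical form of the literal instead of a particular datatype or tag:

import json
import urllib.parse
import urllib.request

QUERY = '''
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?place ?code WHERE {
  ?place dbpedia-owl:areaCode ?code .
  FILTER (STR(?code) = '020')
} LIMIT 10
'''

url = 'http://dbpedia.org/sparql?' + urllib.parse.urlencode(
    {'query': QUERY, 'format': 'application/sparql-results+json'})
with urllib.request.urlopen(url) as response:
    for binding in json.load(response)['results']['bindings']:
        print(binding['place']['value'], binding['code']['value'])

The STR() comparison gives up index-friendly matching in exchange for working across the string and langString variants mentioned above, which is usually an acceptable trade for small teaching queries.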
definitely does not hold, but if you can enforce it in an area, people know how to write SPARQL queries and interpret the results uOn 08.09.2015 22:08, Paul Houle wrote: u0€ *†H†÷  €0€10  `†He" "Shock Troopers of the Edit Wars?" "uI was working on a freebase <-> dbpedia mapping that doesn't destroy dbpedia, so I had the idea of using the wikipedia page id's from freebase to look up dbpedia resources, and the 'key' to that on the dbpedia side is in the page_ids_en_nt.bz2 in there I notice a really curious phenomenon, that there's not a 1-1 correspondence between wikipedia page ids and wikipedia pages, for instance: [ apps]$ bzgrep 'wiki/SS>' ~/dbpedia_3.5.1/page_ids_en.nt.bz2 \"27041\"^^ . \"198274\"^^ . \"14524464\"^^ . Anyway, this strikes me as wrong, but I can imagine that something like this might happen if there was a page called 'SS' that got renamed, and then somebody created a new one, and then that got renamed, and so forth. Right now if I look at dbpedia, I see in Wikipedia, however, this redirects to looking closely at the dbpedia page for \"SS\", I think there's some confusion with this rather nicer fellow: Anyway, I can believe that this has got something to do with the root cause of the general degradation of key integrity that I've seen in dbpedia 3.5." "Help for a SPARQL query on dbpedia" "uHi I'd like to select every movie on DBpedia which release date is missingbut I don't know how to do it. I even tried to put a filter to check if there is an empty string but it doesn't work. This is how I get started: PREFIX db-ont: PREFIX db-work: PREFIX owl: PREFIX rdf: SELECT ?id ?uriFB ?data_uscita WHERE { ?id rdf:type db-ont:Film . ?id owl:sameAs ?uriFB . OPTIONAL {?id db-ont:releaseDate ?data_uscita .} } LIMIT 100 Could anybody help me please? Non sei a casa? Accedi a Messenger dal Web. default.aspx uYou could use the bound operator to check if a variable is bound to a value or not. So this query should work for you: SELECT ?id ?uriFB ?data_uscita WHERE { ?id rdf:type db-ont:Film . ?id owl:sameAs ?uriFB . OPTIONAL {?id db-ont:releaseDate ?data_uscita } FILTER (!bound(?data_uscita)) } LIMIT 100 Regards, Sören" "connections" "uDanny Ayers wrote: Greg, Please add some clarity to your quest. DBpedia the project is comprised of: 1. Extractors for converting Wikipedia content into Structured Data represented in a variety of RDF based data representation formats 2. Live instance with the extracts from #1 loaded into a DBMS that exposes a SPARQL endpoint (which lets you query over the wire using SPARQL query language). There is a little more, but I need additional clarification from you. uKingsley, how do I find out when to plant tomatos here? On 17 April 2010 19:36, Kingsley Idehen < > wrote: uHugh, I don't disagree with what you are are saying, but would like to express that the question of things being fit for purpose depends on the purpose. There is no way the web will ever be 100% reliable, the tools we use to interact with it have to take that into account. On 18 April 2010 01:14, Hugh Glaser < > wrote: uTwo seconds after hitting post I wish to amend that - the web should already be about 100% reliable, given things like 404s and 500s - whether the information is reliable is another matter. On 18 April 2010 09:09, Danny Ayers < > wrote: uDanny Ayers wrote: And you find the answer to that in Wikipedia via ? Of course not. Re. 
DBpedia, if you have a Agriculture oriented data spaces (ontology and instance data) that references DBpedia (via linkbase) then you will have a better chance of an answer since we would have temporal properties and associated values in the Linked Data Space (one that we can mesh with DBpedia even via SPARQL). Kingsley uThanks Kingsley still not automatic though, is it? On 18 April 2010 22:38, Kingsley Idehen < > wrote: uadasal wrote: Well, Agriculture oriented data will be emerging any minute now re. Linked Open Data Cloud. DBpedia is but one Lookup Data Space re. Linked Data, there will be many others with Domain Specificity. The BBC, New York Times, Reuters, are examples of others to come. Each uses DBpedia for looking up Names. In reality, your quest for data should trigger the creation of related data spaces. Our problem right now is that we've too much time on the \"Describing what the heck we are doing\" front instead of just doing it! DBpedia is an example of \"Just Doing It!\". Imagine Linked Data without it, at best hypothesizing about the hypothetical. No DBpedia is one Table (in a sense) within a Federated Database, other Tables will be created and/or discovered in due course. Lets make data rather than waste time on not being able to answer all queries right now. Do you know all the answers to everything in our existence? Of course not, so why demand that of a man made medium such as a federated database exposed via the World Wide Web? Thats trivial, if you can invest time in creating or locating a Linked Data Space for the Agriculture Domain. I am sure such a thing exists right now, I just haven't had the time to look into its existence etc> So some general data is classified along those lines. That would be I can't really say :-) I just know you need a Linked Data Space for the Agriculture domain and the answer to your question will be solved. Kingsley uDanny Ayers wrote: Is it \"Automatic or Nothing?\" . What's mechanical to Person A might be automatic to Person B, both are individual operating with individual context lenses (world views and skill sets). What I can say is this: we can innovate around the Outer Join i.e., not finding what you seek triggers a quest for missing data discovery and/or generation. Now, that's something the Web as a discourse medium can actually facilitate, once people grok the process of adding Structured Data to the Web etc Kingsley uwrote: We are gradually moving to things like this under the general banner of Annotations and Data Syncs. Ironically, its 2010 and still don't even have DDE (a 1980's technology) re. data change notification and subscription etc Anyway, these things are coming, pubsubhubbub applied to linked data, annotations (simply UIs for 3-Tuple conversations) etc I call this Data Spaces and Data Driven Discourse, its all coming :-) BTW - Twitter may also help accelerate comprehension and appreciation of what you seek. Many sources of solutions are taking shape etc Very good point, by the way! Kingsley uadasal wrote: Make a Linked Data SOS call, in some form: plain English (or other language) mail, tweet etc., a URL to nothing (i.e., I would someday like to access Structured Data from here), whatever. Key thing is making a request for the structured data you couldn't find or the query that you would like answered (as you did re. Tomatoes). The process of discovering, scraping, and transforming is getting more automated by the second in a myriad of ways. Here is an old animation showing how the process of Sponging works re. 
Generation of RDF based Linked Data from existing Web accessible resources [1]. And you can pass that through URIBurner [2][3] and start the process of exploring a progressively constructed Linked Data graph via the Descriptor Documents it generates from the Del.icio.us links. Depends, Data is not only like Electricity, it carries the Subjectivity factory of Beauty :-) Of course not, what you need is Search++ (Precision Find across progressively assembled structured linked data meshes) [4] . Take a look at my collection of saved query results and queries (the URLs are hackable). If it stumbles across the DBpedia Data Space on the way, naturally [5]. Always. Links: 1. 6XZy2Q" "DBPedia Lookup: Service Temporarily Unavailable" "uHi, Anyone is aware why the DBPedia Lookup Service is unavailable? And when can it be expected to be up and running again? Best, Karen Hi, Anyone is aware why the DBPedia Lookup Service is unavailable? And when can it be expected to be up and running again? Karen uHey, It goes down from time to time, just needs somebody with access to give it a kick. You can host your own version quote easily though Pablo, do you know if the current version has the recent bug fix in? I hoped it would add a bit more stability :) Cheers, Matt On 12 December 2012 00:14, Karen Stepanyan < >wrote: uHi, It could save some time if it was restarted automatically with a crontab entry. For local instance of lookup, I wrote bash script runLookup.sh : and a contab entry: Cheers, Julien" "Slow Query Initially" "uI created a web application that displays information queried from dbpedia. It works fairly well once it's been running for a little while, but when it starts up, and performs it's first 5 to 6 queries, the results come back very slowly (or possibly incomplete). Once I've made several queries I guess the data is cached because things are fairly snappy. But I get new, not previously queried results in a reasonable time as well, so I'm not sure the improved speed is due to caching data. What is the reason for the very slow result access when I start to query dbpedia? Do you have any suggestions on how to make this perform better? My web app times out the initial calls and appears to not work for a while when it's first started. I guess I could query and store results in my own database prior and then my web app could query this other database, but I'm not very fond of that solution. Thanks Brian uBrian Hardy wrote:" "You and your projects at OKCon" "uDear all, As you will be aware OKCon 2011 is approaching fast: June, 30th & July, 1st We are delighted to announce the release of the OKCon 2011 programme: We are also thrilled about the fantastic line up of speakers: OKCon will be buzzing with open knowledge and open data enthusiasts and is an excellent opportunity to meet people and do thins! It's also a great place to distribute promotional items to get people involved in your projects. If you have banners, flyers, posters or stickers to promote a projects we will gladly help to find space to display them at the venue. If anyone would like to post items to Germany ahead of the conference, please send them to the OKF Germany office and we will make sure they get to the conference venue for you. OKF Deutschland, Prenzlauer Allee 217, 10405 Berlin, Germany To make sure members of the community, working groups and community ambassadors don't miss out, we have created the special *discount code*: OKBERLIN for a €5 discount on your ticket. 
Simply enter the code when registering at We look forward to meeting you at our wonderful venue, Kalkscheune: All the best Daniel" "Getting started" "uHi all, I guess an introduction is in order: I'm a researcher at the Mihailo Pupin Institute (Belgrade, Serbia), currently working on the LOD2 project ( I would be interested in helping out with some (Serbian) Wikipedia/Wiktionary extraction projects, which I gladly accepted. I took a look at the i18n guide, and I have to admit I can't find a clear path I need to followI have a feeling it sends me back and forth between pages (starting with the very first sentence) and appears to be outdated (I can't find the config file). I was wondering if anyone could help me out by outlining the general steps (I'll try finding the details in the guide/documentation), so I can get started. Thanks in advance! Best regards, Uro¹ Milo¹eviæ uHello I am Abhishek Kumar a Computer Science undergrad from India.I want to get involved into dbpedia development.I have cloned dbpedia extraction-framework and am trying to build it.Can someone please help me with some links from where I can understand the code base? I am a beginner in open source though I've previously written some NLP and data retrieval programs. Thanks Abhishek Kumar Hello I am Abhishek Kumar a Computer Science undergrad from India.I want to get involved into dbpedia development.I have cloned dbpedia extraction-framework and am trying to build it.Can someone please help me with some links from where I can understand the code base? I am a beginner in open source though I've previously written some NLP and data retrieval programs. Thanks Abhishek Kumar uHi Abhishek & welcome :) We are always glad when new people want to join our community here's some quick links - documentation - if you're new to DBpedia I suggest you read our latest paper: - we have some beginners warm up tasks here: we also use the DBpedia developer list for technical questions Cheers, Dimitris On Fri, Jan 8, 2016 at 7:57 PM, Abhishek Kumar < > wrote:" "dbpedia.org down? (and up and down)" "uIs anyone else seeing problems with the dbpedia.org SPARQL API at the moment? Can anyone in the know let us know when it might be back to normal service? Thanks, John On 21/4/09 00:48, \"Ryan Shaw\" < > wrote: uHi John, Apologies, It is back now it was down momentarily to resolve the encoding issue reported below Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 21 Apr 2009, at 15:51, John Muth wrote:" "Problem with the MySQL installation script" "uHi, The extraction tool has finished to compile and to download the dumps so now I install MySQL with the \"dump/src/main/bash/mysql.sh\" script, and when I run this script with the \"sudo ./src/main/bash/mysql.sh install /etc/mysql/\" command I fall on this error : ./src/main/bash/mysql.sh: ligne 51: ./scripts/mysql_install_db: No such file or directory And indeed this script doesn't exist in the repo. I have to download it somewhere ? Best. Julien. Hi, The extraction tool has finished to compile and to download the dumps so now I install MySQL with the 'dump/src/main/bash/mysql.sh' script, and when I run this script with the 'sudo . /src/main/bash/mysql.sh install /etc/mysql/' command I fall on this error : ./src/main/bash/mysql.sh: ligne 51: ./scripts/mysql_install_db: No such file or directory And indeed this script doesn't exist in the repo. I have to download it somewhere ? Best. Julien. 
uHi Julien, reading the mysql.sh script you should first set the env variable MYSQL_HOME to you MySQL installation folder. The script is first accessing that folder and then launching some MySQL scripts/commands, link mysql_install_db [1] Cheers Andrea [1] 2013/4/16 Julien Plu < > uaaaa ok we need to download the MySQL archive before I thought the script did it for us. Maybe add this precision in the documentation ? Thanks :-) Best. Julien. 2013/4/16 Andrea Di Menna < > uI quickly put that script together when I ran the abstract extraction last year, and I never expected it to have so many users. :-) I'm sorry that it's so badly documented and not really finished. I'll improve it in May. Don't have time now. :-( JC aaaa ok we need to download the MySQL archive before I thought the script did it for us. Maybe add this precision in the documentation ? Thanks :-) Best. Julien. 2013/4/16 Andrea Di Menna < > uPS: It would be cool if you could add a few lines to or even improve the script itself with better error and usage messages and send a pull request. Thanks! On Apr 16, 2013 12:02 PM, \"Jona Christopher Sahnwaldt\" < > wrote: uOk so if I have your approval I will modify the step-by-step guide. And maybe instead of using a script we can put a copy of a mysql configuration file that users must have, no ? Less work (you don't have a script to maintain) and more easy for users. What do you think ? Unless there is a minimal or maximal version of mysql to use ? Best. Julien. 2013/4/16 Jona Christopher Sahnwaldt < > uHi Julien, You are free to improve any page on our wiki :) Regarding the MySQL script, if you think a conf file is easy to create and use, you can make a pull request and we 'll make the necessary tests. Best, Dimitris On Tue, Apr 16, 2013 at 1:14 PM, Julien Plu < > wrote: uOk thank you :-) I finish all the extraction process, and after I will modify the guide to include what I think is missing. For the conf file I will make a pull request with mine if everything is ok during all the process. Best. Julien. 2013/4/16 Dimitris Kontokostas < > uThe problem with having a whole configuration file in the repo is that the file probably contains many other settings besides the ones we need, and it changes with each MySQL version. When we use command line parameters, we don't have to change any MySQL config files, we just add the three or four settings that we need, and they don't change between MySQL versions. Maybe there is a way to tell MySQL, \"here's an addtional config file whose settings should override the ones in the main config file\". Then we could check in a minimal config file with just our settings. But taking a large config file from a MySQL installation, changing a few lines and then putting the modified version into the repo leads to more problems than it solves. We have the same problem with MediaWiki: we once took LocalSettings.php and ApiParse.php from a MediaWiki installation, modified a few lines and put the whole files into our repo. But in newer versions, the files look quite different, and the versions from our repo just don't work. Users have to edit the files and add the stuff needed for the abstract extractor. Cheers, JC On 16 April 2013 15:39, Julien Plu < > wrote: uAs you wish, it's just a standard configuration with 3-4 modifications nothing more. But at least I can precise in the guide the parameters that they must change ? 
And I think you right for (at least) your LocalSettings.php because it doesn't work with me :-( And I have to test the solution from Dimitris. Best. julien. 2013/4/16 Jona Christopher Sahnwaldt < > uOn 16 April 2013 19:56, Julien Plu < > wrote: Just copy the parameters used in mysql.sh to the config file, without the leading ' uYes I think it's the best idea, and I think too that mysql is configured only by the \"/etc/mysql/my.cnf\" file. Best. Julien. 2013/4/16 Jona Christopher Sahnwaldt < > uOn 16 April 2013 20:21, Julien Plu < > wrote: Oh, I see. But there is a special section for client config paramers, so I guess the parameters from ./bin/mysql uActually MySQL should be looking for configuration files in different locations Cheers Andrea Il giorno 16/apr/2013 20:24, \"Julien Plu\" < > ha scritto: uThat depends where you install mysql, with the packages from ubuntu repositories it's located in /etc/mysql. I think the \"grant all \" line is useless, the best thing (and most easy) is to ask to users to use root user. Here how I see things : step 1 : install mysql step 2 : open my.cnf file (in mysql root directory if installed by hand or in /etc/mysql/ if installed with ubuntu packages) step 3 : add these parametesr in the [mysqld] section to have the utf8 encoding by default : character-set-server=utf8 skip-character-set-client-handshake step 4 : change max_allowed_packet=16M to max_allowed_packet=1G step 5 : change key_buffer=16M to key_buffer=1G instead. step 6 : change query_cache_size=16M to query_cache_size=1G These next step are made for those who installed mysql by hand : step 7 : set socket parameter to \"$MYDIR/mysqld.sock\" step 8 : set datadir parameter to \"$MYDIR/data\" step 9 : open your ~/.bashrc file to add : export MYDIR=/path/where/you/installed/mysql What do you think ? Best. Julien. 2013/4/16 Andrea Di Menna < > uOn 16 April 2013 20:55, Julien Plu < > wrote: Looks good! You're welcome to add these instructions to , maybe in a new section to keep the page manageable. Thanks! JC uDone ! Hope it's fine for you :-) Best. Julien. 2013/4/16 Jona Christopher Sahnwaldt < >" "Encoding error in german labels file" "uHi, Thanks for the update but the file Just curious, why the switch back to the non-international URI's?( uHi, On Fri, Sep 2, 2011 at 13:39, Gerber Daniel < > wrote: Which encoding errors are you referring to specifically? uI mean the encoding errors like the following, which I currently need to regex-replace with the corresponding latin characters. \"\u00C5ngstr\u00F6m (Einheit)\"@de . Cheers, Daniel On 02.09.2011, at 17:28, Max Jakob wrote: uI mean something like this. Which I'm currently have to regex-replace with the corresponding latin character :( \"\u00C5ngstr\u00F6m (Einheit)\"@de . Cheers, Daniel On 02.09.2011, at 17:28, Max Jakob wrote: uLike in the previous releases, characters beyond ASCII decimal 127 are unicode escaped since this is recommended for N-Triples. Cheers, Max On Mon, Sep 5, 2011 at 10:49, Gerber Daniel < > wrote: uHi Max, Thanks for this hint. This means that there are no umlaut characters allowed in the file? What's with the URIs? \"Nink\u014D\"@de . How can I \"un-escape\" those characters? Parsing them with Nxparser and Jena did not work. On 08.09.2011, at 11:14, Max Jakob wrote: uPlease try a Turtle parser. NTriple is a subset of Turtle, but turtle has unicode support. 
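For completeness, the same \uXXXX unescaping can be done outside Java as well; the snippet below is a quick illustration using the Ångström label from this thread and plain Python, and it is a shortcut rather than a substitute for a proper N-Triples/Turtle parser such as rdflib, which remains the more robust route suggested here.

# Decode the \uXXXX escapes used in the N-Triples dumps. Good enough for the
# ASCII-only lines discussed here; a real N-Triples/Turtle parser is safer.
escaped = r'\u00C5ngstr\u00F6m (Einheit)'            # value as it appears in the dump
decoded = escaped.encode('ascii').decode('unicode_escape')
print(decoded)                                       # Ångström (Einheit)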
Sebastian Am 08.09.2011 12:03, schrieb Gerber Daniel: uorg.apache.commons.lang.StringEscapeUtils.unescapeJava(string); did the trick But still, why are the URIs not encoded this way? cheers, daniel On 12.09.2011, at 14:39, Sebastian Hellmann wrote: uHi Daniel, sorry for the late reply. I think your question is quite important as most issues with RDF seem to be encoding related. On 12. Sep. 2011, at 18:48, Gerber Daniel wrote: short answer: because that would have been too easy and it's not how it developed over time :) longer answer: RDF came along quite some time after URIs and initially was tied to its XML serialization. Now, xml already had it's way to escape non ascii values: the \uxxxx or \UXXXXXXXX . This kind of escaping for literals also made it into several other format such as ntriples, n3 and turtle. Now to the \"URIs\">> \"Nink\u014D\"@de . Your terminology here is not very precise, probably because we all tend to be lazy and just call everything URI which looks like one :). To explain this precise terms help: The is an \"IRI Reference\" (it even is an IRI as it's not relative). They were formally called \"RDF URI reference\" (not to be confused with \"URI reference\" from the URI RFC) in anticipation of the IRI RFC: > A URI reference within an RDF graph (an RDF URI reference) is a Unicode string [UNICODE] that: Now what does this mean? The is a Unicode string (!= UTF-8 String), which can be turned into a valid \"URI character sequence\" by following the steps described above. In order to dereference such an IRI we need to transform it into its URI equivalent and then use HTTP. In other words: From the IRI rfc sec. 1.2.a: > \"On the other hand, in the HTTP protocol [RFC2616], the Request URI is defined as a URI, which means that direct use of IRIs is not allowed in HTTP requests.\" This means that while it is allowed to identify things in RDF with IRIs it isn't possible to look them up without prior encoding as %-escaped UTF-8 string, which then is a (ASCII) URI. Now, you might remember that you can just copy the into your browser and get results. Correct, but that's because most browsers do the IRI -> URI magic under the hood so you don't see that they actually request Cheers, Jörn" "Summary of Association Hour in Leipzig" "uDear all, thank you for the very constructive meeting we had in Leipzig. Personally, I was quite happy since we are moving from a loose and slow-moving community to a better-organised form with clear goals and vision and the actual capacity to make public data better! # Membership benefits: - currently the main benefit of becoming a member association is the community effect, i.e. the individual contributions add up and provide a better DBpedia overall, which could not have been achieved by individuals, thus providing a general benefit for everybody - the question arose that there should be clearer message regarding incentive/advantages/benefits for individual members, especially targeting industrial members. Several ideas were given and will be consolidated in a document. # Next Language Chapters to join: - Japanese Chapter - Greek Chapter - Spanish Chapter # DBpedia Uptime: - Option 1: increase funding for more powerful hosting of the main endpoint , a rough figure by OpenLink: 99.99% uptime would be around 50k/year - Option 2: Community crowd-sourcing, i.e. uptime can be improved when we take off the heaviest users. 
For example, packaging DBpedia in docker and offering an easy way for configuration should help potential exploiters to do it on their own infrastructure, thus freeing resources to incidental (and less skilled) users. First steps are to create a list of public DBpedia endpoints and an official tutorials on setting up a DBpedia mirror. - Option 3: Freemium model: limit number of queries per IP (daily/monthly), which are offered for free. Additional queries require contract/subscription/contribution. # Collaboration with WikiData A major barrier to Wikidata using our data is that they do not contain any provenance information. WikiData will only accept data that has a reference, i.e. a clear source to where the claim is given outside of Wikipedia, as Wikipedia is only a collection of information itself, not a source. A clear action would be to extract more references from Wikipedia. # Citation and reference session - Add links to original sources/research work. This can also be reflected in provenance. - quite comprehensive work on the polish DBpedia, there will be a follow-up # NLP & DBpedia session - NE hot topic - two presentation from GSoC - research should focus on bigger problems # Ontology Session - task to research on extracting data with a difference scheme and see if inferencing is possible - Monika will choose an upper ontology or schema.org, create mappings to dbo - Dimitris will temporarily replace the mappings to the new ones and rerun extraction to get data with the new schema Developer session - overview about the dev process and the merging into the framework DataID - Integration RML- Integration Other Actions: - set up a task force for crowd-sourced coordinated hosting - Soeren will prepare a survey to assess direction where we are going - next meeting: application from Greece, Thessaloniki in Spring - add extraction framework to the discussion buckets - provide checksum for file - rethink bzip2 and switch to gzip"
"dbpedia SPARQL Endpoint error" "uHello, It seems there is a connection error when running a query from dbpedia endpoint: error trace: 08C01 Error CL: Cluster could not connect to host 2 22202 error 111 Pierre-Yves Vandenbussche. Hello, It seems there is a connection error when running a query from dbpedia endpoint: Vandenbussche. uHi Pierre, Apologies, there was a resource overload the dbpedia sparql endpoint is back online, so please try again Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 14 Oct 2010, at 10:34, Pierre-Yves Vandenbussche wrote: uThank you. Back to semantic :) Pierre-Yves Vandenbussche. On Thu, Oct 14, 2010 at 12:51 PM, Hugh Williams < >wrote:" "Data set" "ugtummarello wrote: I guess this kind of discussion fits better to the DBpedia mailinglist (which I include in my reply) since as I understood the BT-Challenge will provide a dataset for download anyway and not refer to LOD endpoints. Within DBpedia Chris and Kingsley are the ones who care about the SPARQL endpoint and I guess they will be happy to comment on this. I suppose it does (due to performance issues) not make sense to host all DBpedia datasets within one single endpoint. Sören uHi Giovanni, Could you give an example use case where is it useful to have that dataset served as linked data? I created it for the purpose of statistical analysis and that's done only locally. version? Well, no, because I do not see any benefit and it would only slow down the linked data access. In my opinion Linked Data is about providing useful information, and this is hopefully not only a pure scalability contest. Cheers, Georgi uSören Auer wrote: uOn 6 Dec 2007, at 21:36, Kingsley Idehen wrote: Just to be clear: The only dataset not served in the SPARQL endpoint and linked data is the pagelinks dataset.
If we added the pagelinks into the linked data, then many resources would have ~100 pagelinks attached, compared to ~50 other, more interesting triples. Some user interfaces do not cope well with that. The interesting properties would be lost in the noise. The pure byte size of each HTML and RDF document would also increase significantly. This would further slow down response times. Considering that the pagelinks don't add much value for browsing or SPARQL queries, I believe that the decision not to serve the pagelinks is sound. Richard uRichard Cyganiak wrote: On that basis , for sure! We need to make it easier people to query against DBpedia, anything that diminishes this goal is a detraction and ultimately detrimental to the overall project. Soren: You know I can't let potential performance and scalability misconceptions go unanswered :-) Kingsley" "Looking for churches in Paris - skos:broader transitivity and dbpedia categories" "uHi all I wanted to get at the following data set through \"Churches in Paris, along with French name and description, and geo-tagging elements\" The category \"Churches in Paris\" has typically two levels of subcategories, by \"Religion\", then by \"Arrondissement\" So the following query PREFIX p: PREFIX dbpedia: PREFIX category: PREFIX rdfs: PREFIX skos: PREFIX geo: SELECT DISTINCT ?m ?n ?p ?d WHERE { ?m rdfs:label ?n. ?m skos:subject ?c. ?c skos:broader category:Churches_in_Paris. ?m p:abstract ?d. ?m geo:point ?p FILTER ( lang(?n) = \"fr\" ) FILTER ( lang(?d) = \"fr\" )} yields only the two results which are in a direct subcategory. Which means skos:broader transitivity is not supported. OK. This leads to several remarks and questions. SKOS version declared in the namespace is the 2004 version, which is now superseded by the new 2008 SKOS specification and namespace. For those not aware of the long debate history in SKOS about broader transitivity, the sum up is as following skos(2004):broader is transitive skos(2008):broader is no more transitive. It links only to the direct parent. skos(2008):broaderTransitive is a superproperty of the above, and it is transitive - like skos(2004):broader To sum it up, the current dbpedia interface declares skos:broader in the 2004 namespace, but de facto applies the 2008 semantics (no transitivity). This is indeed confusing, and given there has been already a lot of confusion in people minds about that affair, it's too bad. Has dbpedia in project to switch from SKOS 2004 to SKOS 2008 namespace and semantics, and if yes when? And in such a case case, will it support skos2008:broaderTransitive queries? As a side note, skos:subject is deprecated in SKOS 2008. Back to my churches in Paris, is there currently a workaround to get the results for all subcategories in a single query? Thanks for your attention. uBernard, I don't know about plans regarding the namespace. Just one thing: Wikipedia categories are not transitive, and applying transitivity inference would produce undesirable results. This is because the category system is simply not a well-designed hierarchy, but rather a tagging system where categories can be tagged as well. For example, if we applied transitive inference, then category:Berlin would become a subcategory of category:Mexico, category:Asia, and category:Jesus. Don't ask why;-) Best, Richard On 21 Oct 2008, at 12:27, Bernard Vatant wrote: uRichard Cyganiak a écrit : Suppose so. Maybe I was naively dreaming of some \"local transitivity\" (say, 2 or 3 steps ) Yep. 
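The "local transitivity" Bernard is after can be approximated today in a single query without any inference support, by spelling out one and two levels of skos:broader as a UNION. This is only a sketch based on his original query: the prefix IRIs are the conventional ones (his were stripped from the mail), the article–category link is written as skos:subject as in the DBpedia release of that time (later releases use dcterms:subject), and geo:lat/geo:long are used instead of his geo:point since those are the coordinate properties mentioned elsewhere on this list. SPARQL 1.1 property paths (skos:broader*) would shorten it further, at the price of the unbounded-traversal oddities Richard describes.

PREFIX p:        <http://dbpedia.org/property/>
PREFIX category: <http://dbpedia.org/resource/Category:>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos:     <http://www.w3.org/2004/02/skos/core#>
PREFIX geo:      <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT DISTINCT ?church ?name ?lat ?long ?abstract WHERE {
  # direct members of the category, plus one and two levels of subcategories
  { ?church skos:subject category:Churches_in_Paris . }
  UNION
  { ?church skos:subject ?c .
    ?c skos:broader category:Churches_in_Paris . }
  UNION
  { ?church skos:subject ?c .
    ?c skos:broader ?mid .
    ?mid skos:broader category:Churches_in_Paris . }
  ?church rdfs:label ?name ;
          p:abstract ?abstract ;
          geo:lat ?lat ;
          geo:long ?long .
  FILTER ( lang(?name) = "fr" && lang(?abstract) = "fr" )
}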
With neither loop nor meta-level consistency control > For example, if we applied transitive inference, then category:Berlin Well, I did not figure it was *that* bad Makes me wonder if using skos:broader to express dbpedia categories \"hierachy\" is a good idea at all Bernard" "Problem with wikiPageWikiLink" "uHi all, I am trying to use the DBPedia by using the access point I am trying to run some queries for using the the pages linking to a given label. For instance, I tried the following query: select distinct count(?l) ?s from from where { {?s < \"i\")}. ?s < order by DESC(count(?l)) limit 10 However, it does not working. Can anyone provide me a simple example to use wikiPageWikiLink in a correct way? Thank you very much in advance for your time. Best Regards. Serena" "Dbpedia Lookup Improvements Introduction and Wikipage Link" "uHello Congratulations everyone for getting selected in GSoC 2016. I am Kunal and I will be working on DBPedia Lookup Improvement. I have had an initial discussion and brainstorming meeting with my mentor and I have created a page[1] on the DBpedia git repository. The page(as of date) contains the initial shape of the idea. I am yet to have a full team meeting after which the page will be more informative and precise about the project. I will be updating the page by every Friday 10:00 (UTC). Looking forward to having a great summer !! Regards Kunal Jha On Thu, May 12, 2016 at 11:36 AM, Kunal Jha < > wrote: uWelcome Kunal, DBpedia lookup is a low-profile but probably one of the most used DBpedia projects since every time it goes down we get many complains :) The new version Kunal will deliver will bring a lot of improvements but any other feedback people that use lookup may have will be more than welcome Cheers, Dimitris On Thu, May 12, 2016 at 1:29 PM, Kunal Jha < > wrote:" "DBpedia Live End points return different data for same query" "uHi, I ran the following query against the two DBPedia live endpoints mentioned on the DBpedia Live website ( Query: PREFIX  dbo: PREFIX  rdf: PREFIX  dbpprop: SELECT  count(*) FROM WHERE   { ?person rdf:type dbo:Person .     ?person dbpprop:name ?name .     ?person dbo:birthDate ?birthDate .     ?person dbo:abstract ?abstract .     ?person dbo:wikiPageID ?wikiPageID .     ?person dbo:wikiPageRevisionID ?wikiPageRevisionID     OPTIONAL       { ?person dbo:wikiPageModified ?wikiPageModified }     OPTIONAL       { ?person dbo:wikiPageExtracted ?wikiPageExtracted }     FILTER langMatches(lang(?abstract), \"en\")   } Result when run on end point Result when run on end point Thanks, Shruti Hi, I ran the following query against the two DBPedia live endpoints mentioned on the DBpedia Live website ( < OPTIONAL { ?person dbo:wikiPageModified ?wikiPageModified } OPTIONAL { ?person dbo:wikiPageExtracted ?wikiPageExtracted } FILTER langMatches(lang(?abstract), \"en\") } Result when run on end point Shruti" "Relationship between a category (skos:Concept) and the matching resource" "uHello, I am planning to train topic models to recognize text content and find the most related skos:Concept from DBpedia. However most of the time the dbpedia nodes for categories / skos:Concept are less informative than the resource assiocated with the main article of the afore mentionned category in wikipedia. Let's take an example, the category \"History\" is available as a skos:Concept in dbpedia as: - It matches the following category page in wikipedia: - On that page you can see the text: The main article for this category is History. 
Which is a link to the following wikipedia page: - Which has the following RDF resource in DBpedia: - However the relationship: is not directly available in the DBpedia graph. So my questions are: 1- Is it safe to assume that for any in the matching 2- Is there a plan to make that relationship more explicit? Regards, uHi all, One year later I am hitting this issue again (see quoted email below for reference). Since last time it is now possible to improve the mappings using category information is extracted from there. Maybe the SKOS mapping logic is hardcoded in the extractor? To recap, the piece of data I am looking for is the link between a SKOS topic such as DBpedia resource that is matching the primary Wikipedia article of the category, in that case The Wikipedia makes this information explicitly available by the use of the template \"Cat_main\". For instance, the source of http:/en.wikipedia.org/wiki/Category:Arts includes the following snippet (generally at the beginning): {{Cat main|The arts}} It seems to be widely used for any category that has a real world semantic interpretation (not just for the sake of Wikipedia housekeeping): So my question is: is it possible to write a mapping for this using which target property should I map the relation too? Best, 2010/4/16 Olivier Grisel < >: uOn Tue, Jul 19, 2011 at 19:22, Olivier Grisel < > wrote: Correct. Article categories and the category hierarchy both have their own extractors. The choice of relation name is completely up to the mapper. At the moment, it is not possible to extract an object property for this, because \"The arts\" does not have a wiki link to its page, i.e. the \"[[\" and \"]]\" are missing. You can, however, write a mapping for the template Cat_main that extracts a string that contains this PropertyMapping: {{ PropertyMapping | templateProperty = 1 | ontologyProperty = wikiCategoryMainResource }} The result would be suboptimal: \"The Arts\" In order to be able to extract the better triple (as an object property) http://dbpedia.org/ontology/wikiCategoryMainResource http://dbpedia.org/resource/The_Arts you would have to extend the mapping language. A flag that tells the extraction to always treat everything like a URI suffix would work. Or the more general solution, suggested by Pablo, to include something like a URI pattern that lets you specify a URI with a place holder in which found string is inserted. Links to external datasets could also benefit from this solution. Of course, you can also write separate extractor for this. Cheers, Max uThanks for your answers. A URI template mapper would indeed be useful IMHO. Maybe it should take URL escaping into account (at least the Wikipedia URL rules such as the special handling of whitespace chars that are replaced by underscore chars)." "variable bindings in SPARQL query" "uHi! Is variable binding supported by the DBpedia SPARQL endpoint (or Virtuoso)? When I try to ask the genre of musician Josh Turner, the following SPARQL query is generated with sesame: \"queryLn=SPARQL&query;= SELECT * WHERE { ?person ?Y} LIMIT 7 &infer;=true&$?person= \" musician-genre pairs are returned. When I replace ?person with Josh Turner's IRI, Country music is returned correctly. I uploaded the sample code: So is parameter binding supported? (Or is it relevant, like in SQL to defense against SPARQL injection, or like? Or from query optimisation point of view?) Thanks, C.M." 
"How to get contributors of wiki articles" "uHi, I am working on some data mining project and want to get the information about wikipedia articles and their contributors(editors). There used to be a tool to get this information but it's not working any more. Does anybody know about this? Thanks for your help in advance. Best wishes, Jianpeng uHi Jianpeng, we have recently adopted an extractor to extract the contributors information of Wikipedia pages. You can find the contributors information in our DBpedia-Live endpoint available at On 03/12/2012 03:43 PM, Jianpeng Xu wrote:" "NER for Czech (was dbpedia for Czech language)" "uHi Václav, Thanks for the introduction. You may want to take a look at existing NER solutions, such as the Stanford NER or the one from the Mallet toolkit, and try to train them for Czech. You may also be able to benefit from DBpedia Spotlight, which performs entity (and concept) extraction and annotation. Our output is a superset of NER, since it segments the input, assigns types and further assigns unique identifiers to each entity as well as concept. We are working on the internationalization, so your help with Czech would be most welcome. Cheers Pablo On Sep 29, 2011 3:08 PM, \"Václav Zeman\" < > wrote: uHi, Thanks for the answer. I am able to provide a team to help with DBpedia internationalization (e.g. adding new Czech mappings etc). Our team is working on semantic web research for Czech, not only NER. Therefore our interest in Czech DBpedia internationalization is great. Thanks Now, our working on the NER solution is only just beginning and we are looking for a optimal auxiliary tool. DBpedia is just the simplest solution for us. Václav From: Pablo Mendes [mailto: ] Sent: Friday, September 30, 2011 9:39 AM To: Václav Zeman Cc: Subject: NER for Czech (was Re: [Dbpedia-discussion] dbpedia for Czech language) Hi Václav, Thanks for the introduction. You may want to take a look at existing NER solutions, such as the Stanford NER or the one from the Mallet toolkit, and try to train them for Czech. You may also be able to benefit from DBpedia Spotlight, which performs entity (and concept) extraction and annotation. Our output is a superset of NER, since it segments the input, assigns types and further assigns unique identifiers to each entity as well as concept. We are working on the internationalization, so your help with Czech would be most welcome. Cheers Pablo On Sep 29, 2011 3:08 PM, \"Václav Zeman\" < > wrote: uVáclav, Thanks for your interest on DBpedia internationalization. I was rather pointing you to *another* internationalization effort. Please note the subtlety: the DBpedia Extraction Framework produces the DBpedia Datasets which are used in conjunction with Wikipedia pages by DBpedia Spotlight to create a text annotation system. Extraction Framework: Dataset: Text Annotation Tool: The DBpedia Internationalization and the DBpedia Spotlight Internationalization are two related but distinct efforts: You are welcome to contribute to both. Best, Pablo 2011/9/30 Václav Zeman < >" "DBpedia Spotlight 0.7" "uDear DBpedia users and developers, DBpedia Spotlight is a tool for connecting text to DBpedia through the recognition and disambiguation of entities and concepts from the DBpedia KB. We are happy to announce version 0.7 of DBpedia Spotlight, which is also the first official release of the probabilistic/statistical implementation. If you use this version of DBpedia Spotlight in your research, please cite the ISEM13 paper (bibtex included at the bottom). 
Joachim Daiber, Max Jakob, Chris Hokamp, Pablo N. Mendes. Improving Efficiency and Accuracy in Multilingual Entity Extraction. At ISEM2013. The changes to the statistical implementation include: - smaller and faster models through quantization of counts, optimization of search and some pruning - better handling of case - various fixes in Spotlight and PigNLProc - models can now be created without requiring a Hadoop and Pig installation - UIMA support by @mvnural - support for confidence value See the release notes at [1] and the updated demos at [4]. Models for Spotlight 0.7 can be found here [2]. Additionally, we now provide the raw Wikipedia counts, which we hope will prove useful for research and development of new models [3]. A big thank you to all developers who made contributions to this version (with special thanks to Faveeo and Idio). Huge thanks to Jo for his leadership and continued support to the community. Cheers, Pablo Mendes, on behalf of Joachim Daiber and the DBpedia Spotlight developer community. (This message is an adaptation of Joachim Daiber's message to the DBpedia Spotlight list. Edited to suit this broader community and give credit to him.) [1] - [2] - [3] - [4] - in Multilingual Entity Extraction}, author = {Joachim Daiber and Max Jakob and Chris Hokamp and Pablo N. Mendes}, year = {2013}, booktitle = {Proceedings of the 9th International Conference on Semantic Systems (I-Semantics)} }" "DBpedia user, who are you?" "uHi all, I'm currently doing some planning for the future roadmap of DBpedia, and therefore gathering requirements and use cases. So I'm wondering: - Who is using DBpedia today or has evaluated it in the past, - What are you doing with it or how would you like to use it, - How would you like to see it evolve? Especially interested in usage of DBpedia (and Linked Data) within organizations or even commercial scenarios.
Please let me know, either on-list of off-list (and state in case you don't want that information to be disclosed). Thanks, Georgi uGeorgi Kobilarov wrote: Dear folks, I'm Davide Palmisano, an Asemantics[1] senior researcher and I'm very pleased to reply to Georgi's questions. Currently we are using DBPedia within two disjoint main scenarios. the first one is related to the EU project called NoTube[2] where we are planning to use DBpedia as a main knowledge core to build semantic web based user profiles in order to make personalized TV content recommendation. This is a research project mainly aimed to produce innovative algorithm for the content discovery. The second one, partly covered by an NDA so I cannot be more precise, is an ambitious project that we will present to the next SemWeb09 called 99ways[3] where we are planning to make an intensive use of DBpedia. For example, we are currently making an autocompletion service that taking as input a substring it returns a list of DBpedia URIs grouped by their most representative skos:subject. The way we are calculating the most representative skos:subject for each URI is the key point within the overall algorithm. oops, as the precedent one :) Grow, grow and grow! Jokes apart, the first real and important evolution that comes up in my mind is partially related to the uptime and to the scalability of the system. Improving the scalability of the SPARQL end point backend would be the key task to allow the resolution of very frequent and complex SPARQL queries. all the best, Davide uDavide Palmisano wrote: Nice to get the very first response from you :-) Increasing the scalability of the SPARQL endpoint is a function of: 1. Actual Virtuoso instance (we currently use the Single Server Open Source Edition rather than the Commercial Cluster Server Edition) 2. Controls deliberately put in place on the server side to protect the public endpoint e.g. queries generating large result sets etc3. Reality of Web Scale (the public endpoint is just that a free public endpoint on the Web, you need a service specific variant for your varied SLA type requirements hence the creation of DBpedia AMIs [1]). Links: 1. VirtEC2AMIDBpediaInstall uHi Georgi Still some examples in the XQuery wikibook of using SPARQL with DBPedia, although in some cases they have broken because of changes in wiki categories. The main application which seems to be used a bit is the category- based picture browser (partly because its featured on the Brooklyn Museum API gallery since I mashed up their exhibits) (I like the Towns on the River Severn map) Its only running on a small server here and it uses about 12 different SPARQL queries. The Alphabet view is for a grandchild to make Alphabet posters Chris This email was independently scanned for viruses by McAfee anti-virus software and none were found uHello! On Wed, May 20, 2009 at 12:04 PM, Georgi Kobilarov < > wrote: Glad to contribute to that :-) We are using DBpedia in quite a lot of services at the BBC, as detailed in our ESWC paper [1]. I am also using it in almost all the services hosted at dbtune.org. Wrt. future plans, here are a couple of things that would be very great to have in future versions of dbpedia: 1) Query by example. You submit a bunch of DBpedia resources, and it returns a SPARQL query selecting them and resources with similar properties. 
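The raw lookup behind the autocompletion service Davide describes can be sketched as a single query: match resources whose label starts with the substring typed so far and return their categories, leaving the "most representative subject" ranking — the part he calls the key point — to application code. This is only a sketch: property names are those of the DBpedia release of that period (skos:subject; later releases use dcterms:subject), "berl" is just an example input, and a regex over every label is exactly the kind of heavy query the endpoint-side controls Kingsley mentions are there to catch, so the LIMIT matters.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?uri ?label ?subject WHERE {
  ?uri rdfs:label ?label ;
       skos:subject ?subject .
  # prefix match on the string typed so far ("berl" is only an example)
  FILTER ( langMatches(lang(?label), "en") && regex(str(?label), "^berl", "i") )
}
LIMIT 50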
2) Live update from Wikipedia (but it seems quite close to being real, now) 3) An interface for submitting out-going links, instead of having to ping the dbpedia list each time Cheers, y [1] uGeorgi, What I'm working on is far below the scale of what others have said so far, but here goes anyway! :) I'm working on a way to let university faculty create profiles of the classes they teach, in part using DBpedia resources as common references for topics, tools, or other things they use in their teaching (and interests for the faculty and students themselves). (If anyone's interested: a quick slideshow from a talk [1] and a longer post [2]. Code is all still sketchy, but here's [3] the data input form I'm working on, and an example [4] of using topics to find courses) So I'm using DBpedia to disambiguate concepts in the app, making heavy use of the lookup service (it's fantastic, BTW!), as well as to bring additional data into the app. On how I'd like to use it/seeing it evolve, I'm not sure if this goes against any of the design principles you are working from, but mechanisms and/or guidance to smooth out some of the oddities of the data in wikipedia would be wonderful. Take for example the sculpture The kinds of quirks I'm seeing are on the year property, c. 130 CE. In the wikipedia page, both 130 and CE have been made links to wikipedia entries. So DBpedia lists two dbp:years as resources for 130 and CE And often enough a dbp:year property comes through more as I'd expect it, as a literal. Similar examples of hard-to-predict data due to different individual linking choices in wikipedia are pretty common. And most can be worked through on my end, for example by looking for a lang attribute. But the overall evolution that I think would make it easier on the usage end is more predictability in the data coming back uAt Turn2Live.com we will start (soon) to consume music data about artists from LOD and obviously from DBpedia. We are at a very initial phase. Hope to have demos soon! Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Wed, May 20, 2009 at 1:04 PM, Georgi Kobilarov < >wrote: uHoi, I have been evaluating DBpedia. What I would like to see is use DBpedia to improve the quality and consistency of Wikipedia itself and make use of its curated data for an experiment where Semantic MediaWiki will be enabled for a copy of English language Wikipedia. This would mean that DBpedia would get more integrated with its source and consequently it would become easier to improve its data quality. Thanks, GerardM 2009/5/20 Georgi Kobilarov < > uYves Raimond wrote: Re. pinger services for SPARUL type effects, the availability of a FOAF+SSL based DBpedia SPARQL endpoint will make this feasible. And for those that don't have WebIDs (URIs), OAuth based SPARQL endpoint will do. Kingsley uHi Georgi, hi all, first of all, thanks to the DBpedia team - you make a great job! My name is Andreas Blumauer, and we (punkt. netServices [1]) have been working on a SKOS based thesaurus management system called PoolParty [2] in the last two years. Some additional facts about the system can also be found here [3]. Besides all open trainings, in-house seminars etc. 
we do together with folks from Semantic Web Company [4], where we demonstrate DBpedia as a \"best practice\", we use DBpedia (and in the future also other datasets) to link \"local\" SKOS concepts with resources from the LOD cloud to enrich concepts from the local thesaurus (using skos:extactMatch). Therefore we use Georgi´s lookup service at the moment [5]. This additional information can be used for several purposes, e.g. to expand a concept and its subconcepts. If authors wish, thesauri edited with PoolParty can also \"exposed\" to the LOD Cloud (SPARQL endpoint, Pubby). See how it works on a short Youtube video [6]. We would appreciate very much if DBpedia will expand its multilingualism in the next period of time. Any plans? Thanks, Andreas [1] [2] [3] [4] [5] [6] watch?v=qpf8sk97gMw" "Problem with Sparql Endpoint" "uHi All I am using Jena to hopefully query the DBPEDIA ontology for data about a specific person I am using the following query PREFIX foaf: SELECT ?name ?uriWHERE { ?uri a foaf:Person. ?uri foaf:name ?name. FILTER (langMatches(lang(?name), \"en\")). FILTER( regex(?name, '^Paula Creamer', 'i') ) }* The code I am using is as follows, am I right in thinking that because I do not have a local ontology their is no need for a model to be present? I just wish to query the data. *out.println(\" QUERYBUILDER \"); // retrieve parameters String param = request.getParameter(\"param\"); out.println(\" THIS IS THE PARAMETER PASSSED IN: \" + param + \" \"); Query query = null; // set query string String queryString = \"PREFIX rdf: < * + \"PREFIX rdfs: \" + \"PREFIX foaf: \" + \"SELECT DISTINCT ?name ?uri \" + \"WHERE{\" + \"?uri a foaf:Person.\" + \"?uri foaf:name ?name.\" + \"FILTER(langMatches(lang(?name),'en')).\" + \"FILTER(regex(?name,'^\" + param + \"','i') )}\"; query = QueryFactory.create(queryString); out.println(\" query created \"); QueryExecution qexec = QueryExecutionFactory.sparqlService(\" * out.println(\" query executed \"); ResultSet results = qexec.execSelect(); out.println(\" iterator executed \"); try { while(results.hasNext()){ int x = 0; QuerySolution soln = results.nextSolution(); Resource r = soln.getResource(\"uri\"); out.println(\"result: \" + x); x ++; } } catch (Exception e) { out.println(e.getCause()); } finally { qexec.close(); }* This code is all ran through a very basic servlet at the moment (trying to keep it simple for experimentation purposes), when I run the code and pass in someone's name that I know is available such as 'Angelina Jolie' I get an HttpException: 500 SPARQL Request Failed* error does anyone see where I am going wrong, thanks in advance." "Broken Pages for German Cities" "uHi all, It seems that the dbpedia.org endpoint started importing German language dbprop attributes for German cities. However this is causing the Sparql queries that include German dbprop properties with German Umlauts to break(we had similar problems on the German DBpedia endpoint at de.dbpedia.org, however using the Internationalized extractor and Virtuoso 6.1.4 fixed it). As a result, the XML/RDF files of those cities are broken too, as you can see here: Kind Regards, Alexandru Todor This issue was reported originally to the dbpedia-germany mailing list by Alina Weber, I'm forwarding it since she's waiting for 2 weeks to get approval to the dbpedia-discussion mailing list (it would be nice if someone could check the queue). uAlexandru, Thanks for forwarding. There is no approval for subscribing to dbpedia-discussion. 
Cheers, Pablo On Fri, Mar 16, 2012 at 12:13 PM, Alexandru Todor < >wrote: Alexandru, Thanks for forwarding. There is no approval for subscribing to dbpedia-discussion. . uHi Pablo, No problem, and thanks for your usual fast response. I was also wondering about the approval thing, I will forward your answer to the relevant topic on the dbpedia-germany mailing list. Cheers, Alexandru On 03/16/2012 12:42 PM, Pablo Mendes wrote: wondering about the approval thing, I will forward your answer to the relevant topic on the dbpedia-germany mailing list. Cheers, Alexandru On 03/16/2012 12:42 PM, Pablo Mendes wrote: Alexandru, Thanks for forwarding. There is no approval for subscribing to dbpedia-discussion. Alexandru Todor < > wrote: Hi all, It seems that the dbpedia.org endpoint started importing German language dbprop attributes for German cities. However this is causing the Sparql queries that include German dbprop properties with German Umlauts to break(we had similar problems on the German DBpedia endpoint at de.dbpedia.org , however using the Internationalized extractor and Virtuoso 6.1.4 fixed it). As a result, the XML/RDF files of those cities are broken too, as you can see here: mailing list by Alina Weber, I'm forwarding it since she's waiting for 2 weeks to get approval to the dbpedia-discussion mailing list (it would be nice if someone could check the queue)." "New Mapping Chrome extension" "uHi all, We just pushed a *VERY BASIC* chrome extension in Git to help mapping editors start a new template mapping. unpacked extension\") and point to the extension folder. To test it go to language), find an uncreated infobox (red/grey) and click edit. Once you create the page, the extension will fill the edit text with basic data coming from template statistics. This is not rocket science, true, but it helped me a lot when adding new mappings :-) Hope this can help other editors as well. Of course, everyone is free to edit/add/remove/change and suggestions are more than welcome. Happy mappings! Andrea Hi all, We just pushed a *VERY BASIC* chrome extension in Git to help mapping editors start a new template mapping. Andrea" "Warm greetings and a few issues to resolve on the DBpedia wiki" "uHello DBpedians, I am Soumen Ganguly, a first year masters' student at Saarland University, Germany. First of all, I'd like to congratulate the DBpedia community on doing such wonderful work on maintaining such an informative infrastructure. I came across DBpedia very recently and was struck instantly by its immense usefulness, and so I wanted to get involved with the community behind this work. I looked through the wiki on ways of getting involved and stumbled upon the fact that it is not complete in all aspects. There are outdated information, broken links and inconsistent information between the wiki hosted at Github and the official DBpedia wiki(though it complements each other). So, I thought what better way to contribute than to improve the documentation on 'How to contribute and get involved'. I wanted to make changes in the wiki but did not find enough information on how to create an account. So, I'd be glad if someone could give me some pointers to go ahead with creating an account and making changes. Thanks a lot again for the information and looking forward to being a part of the wonderful community. Cheers, Soumen uThis is awesome Soumen, your offer is really appreciated, I will send you account details in a separate mail. 
Other people interested to help with the website just ping me to give you an account Cheers, Dimitris On Tue, Sep 20, 2016 at 5:01 PM, Soumen Ganguly < > wrote:" "DbPedia is down" "uHello! Is there any estimation when dbpedia.org is available? Regards, Alexander Hello! Is there any estimation when dbpedia.org is available? Regards, Alexander uHi Alexander, Dbpedia is online, please confirm if you are still having problems accessing it ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 29 Nov 2010, at 04:58, Alexander Sidorov wrote:" "200 OK and URIs" "uGreetings dbpedians, In doing some linking of metadata here at the Library of Congress to dbpedia [1,2] we came up with a couple of questions: 1. Is it right that a URI for a resource that doesn't have a corresponding resource in WIkipedia returns a 303 See Other to application/rdf+xml and text/html URIs, which in turn 200 OK? :~$ curl uEd, On 11 Aug 2009, at 15:54, Ed Summers wrote: uRichard Cyganiak wrote: uOn 11 Aug 2009, at 17:08, Kingsley Idehen wrote: Ok cheers, I was not sure wether this can be done via tweaking the mapping rules. Best, Richard uRichard Cyganiak wrote: There is a handler associated with the mapping rule that needs fixing, it should test the information resource URL it generates before returning a response code. If you're familiar with Virtuoso's Conductor UI, you will notice that we have drop down labeled: \"HTTP Response Code\", with \"internal redirect\" as one of the drop down values. When selected, this implies that HTTP responses are coming from a custom handler. Kingsley uOn Tue, Aug 11, 2009 at 1:41 PM, Kingsley Idehen< > wrote: Kingsley, Richard uEd Summers wrote:" "not climate related bug in the arctic ocean" "uhi, looking at this page : i find two floats for geo:lat and geo:long (90.000000 and 0.000000). however this page : 90 for geo:lat and 0 for geo:long. i haven't tested it with other floats, but looks to me like a bug with datatypes. any clues? wkr www.turnguard.com uHi, I think 0 and 0.000000 are equivalent representations of the same xsd:float value. Christopher On Tue, Oct 27, 2009 at 12:01, Jürgen Jakobitsch < > wrote:" "can't find Goethe using new faceted browser." "uhello: I can't find Johann Wolfgang von Goethe using new faceted browserstarting with \"Johan\" in the Persons section, but I can find him by starting with birthdate can someone tell me why? thanks, michael grobe uGrobe, David Michael wrote: uKingsley, Definitely without a good UI, it is very hard to use this. When I get to step 2, I see 4312 types. Am I suppose to go through all of these and select the ones I need? There should be a unselect all button. How can I use lod.openlinksw.com to FIND all actors who were born in NYC and died in 2003 in NYC? it is very easy to do it with the dbpedia faceted browser: Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Sat, Sep 26, 2009 at 7:07 PM, Kingsley Idehen < >wrote: uMichael, On 26 Sep 2009, at 23:08, Grobe, David Michael wrote: It seems you can find him by searching for \"johann\" or \"goethe\" in the persons section, but not searching for \"johan\", \"jo\" or \"goet\". I suppose the faceted browser does only exact matches on words, and doesn't search for partial matches and doesn't correct misspellings. 
Best, Richard uJuan Sequeda wrote: But please understand, we built a faceted \"search and find\" engine with an *API for Linked Data application developers* to use [1]. The basic UI was about demonstrating the capabilities of the underlying engine which comes down to the time-less challenge of scalable faceted search, find, and reasoner based data reconciliation over Billions of triples. Searching on pattern: Johann, using either faceted \"search and find\" UI is how you get to the essence of the matter addressed by our engine level (which I also outlined in my initial response to Christian). Your response falls into to the same category as those seeking a better DBpedia UI from the same folks responsible for preparing and publishing its Linked Data corpus. I maintain that Linked Data uRichard: But that's the strange partUsing the new browser, when I click \"Person to filter by \"Person\", I get a list of predicates, including \"name\". Then when I enter \"jo\" in the \"name\" text field, a pull-down list of names appears, starting with \"John\", \"John Davis\", etc. If I enter \"john r\", I get a pull-down containing \"John Russell(7) and John Reiley (6)\", and if I actually enter \"John Davis\", I get a list with a single \"John Davis (18)\" entry, which I can select for the John Davis content but \"johann\" gets me nothingnot even \"Bach or Pachelbel\". Very odd; it's as if Goethe is not in the database, or maybe German names are not being processed (neither of which sound likely). :michael grobe uOn 28 Sep 2009, at 17:09, Grobe, David Michael wrote: Ah, now I see what you mean. I think that the auto-completion in the facet fields just covers the top N values for the facet. You find \"John Russell\" and \"John Reiley\" because there are several entries with that value (that is, people with that name uMichael, Richard, you're correct, currently only the top values of each facet are used for auto suggestion, and the selected value must match exactly. When I find the time, I will extend the browser to use inexact matching for fields like 'name'. You can use the text search box at the top of the page to search for \"Johann\", though: Johann Wolfgang von Goethe is the second result. Christopher On Mon, Sep 28, 2009 at 19:32, Richard Cyganiak < > wrote: uChristopher: I suppose this is a different issue, but since you're \"on the line\", I'd like to know if the interface supports searches on ontology ranges, so that I could, for example, search for any gene annotated with any Gene Ontology category \"descending from\" GO:0005488. or for any gene annotated with a GO category \"between\" GO:0005488 and GO:0008047, or any gene within 2 hops from GO:0005488, yada, yada. Thanks, :Michael uHi Michael, we currently do not extract Gene Ontology data from Wikipedia. We extract data for which doesn't use Gene Ontology. The template (and maybe others?) uses Gene Ontology, but we don't have a mapping from that template to our ontology (yet). Besides, most (if not all) of the IDs we already extract (ISBN, Protein Data Bank ID, etc.) are defined as strings, not as numbers, because with most of these, ranges and other numeric operations don't make sense. Would it actually make sense to search for genes annotated with a GO category \"between\" GO:0005488 and GO:0008047? Not sure what you mean by \"descending from\" and \"2 hops from GO:0005488\", but I guess the DBpedia extraction and/or the browser would have to use some kind of inference to enable such searches. 
There are other domains in which such inference would be useful (e.g., the browser doesn't know that someone who was born in Berlin was also born in Europe), so maybe one day we will implement it, but it's currently not high on our to-do list. Bye, Christopher On Thu, Oct 1, 2009 at 17:54, Grobe, David Michael < > wrote: uChristopher: Thanks for you thoughtful reply; it highlights important issues. I think I could have been clearer about my questions, but I was not sure of the context, and I'm not an expert in this area, so here are my \"speculations\": I think the issue is pretty general, in that it applies to ontologies other than GObecausealmost all ontologies define categories that are linked together in a hierarchical (or DAG) structure. So think of the (an) evolutionary \"tree\". The components (nodes) are linked (by edges) into a tree structure. (a subset of a graph) The names of each component either don't have type, or we don't care what the type is because they are just names, and the type conveys no \"relationship\" (as with the implicit relationships among integers). But just as you suggested with regard to cities, the relationships among the nodes may sometimes be useful for \"inference\", and \"mechanically,\" all that means is navigating an ontology as part of the search process. Suppose for example that you have a database of organisms that includes a genus-species for each organism, AND you have the (an) evolutionary tree at hand built from all organism categories from \"root\" to \"genus-species\". Then you can use the tree to help answer questions like \"Show me all the mammals\", because you can traverse the ontological tree \"down\" from mammal to the lowest level (\"genus-species\"?), and then list ANY organism from the database annotated with ANY category in the ontology \"below\" mammal. Now in the case of GO, things are a little different, but the ideas are similar. I might for example want a list of genes that have been annotated as being involved in any \"biological process\" that is a \"part_of\" (as indicated using the \"part_of\" predicate) a specific process. That is, I want to find any gene annotated as being part of any process that is one hop \"below\" the specific/target process. With respect to the \"is_a\" relationship within the \"molecular function\" portion of GO, I might want to ask for genes annotated as being types of \"enzyme activator activity\" (GO:0008047), so I might like to find genes annotated with GO:0008047 AND any category \"one hop below\" 8047 to find genes that are known to display more specific biochemical enzyme activity like \"helicase activity\" (GO:0004388) and \"ATPase activity\" (GO:0016887). (I'm using a mutated version of GO for this example, by the way; these relationships are not necessarily accurate.) This seems a bit too mechanical to call \"inference\", but as far as I can tell, that's exactly what happens \"under the covers\" if you apply inference rules along with an ontology to a database. Actually, this seems to me to be quite a powerful capability that may help distinguish semantic from relational approaches. It's not that you can't do things like this with relational databases, but I think it's \"harder\". For example, you can easily write a SparQL query to list all the n-level descendents or ancestors of a node in a tree, if you know \"n\". And then you can write another SparQL query to search for all nodes annotated with any member in the list. 
If you want to find ALL \"descendents\" you actually need to iterate an indeterminate number of times, which SparQL does not support so you have to enter multiple SparQL commands, arrange for inference rules to be applied, or build and navigate a transitive closure on the DAG, as far as I know. Well, regardless of this discussion I like your interface a LOTI think it REALLY simplifies the query process. When I present semantic technologies to normal people, they don't seem too excited about writing SparQL queries (well, ok, they recoil in horror :-), and your interface will provide another, much easier and more palatable, alternative. sothanks again for your work, and I hope you eventually find time to enhance it to support at least the basic \"hierarchy-relative\" queries, :michael grobe uDavid, Richard Not a big mystery if you have a look at the code source of the html page In your example the selected facet Person retrieves only a top list of instances which are cached in the page as an array var values = new Array(\"David (24)\",\"John (19)\" etc And the auto-suggest field taps in this array using javascript new AutoSuggest(\"foaf-name\", values); So it does not query the data base Not really a bug, not really a clean feature either :) Bernard 2009/9/28 Richard Cyganiak < >" "DBpedia 3.2 release - Can we get a moremanageable download? One tar file?" "uMarvin, when we started, we were afraid that people using not such high-scalable rdf stores would struggle if there's just one big rdf dump of dbpedia. that's why we split the dataset into so many little pieces, so that people can choose which parts they really need. in the future, we might provide an additional \"core\" dataset dump in one piece as well. Thanks for the feedback! Cheers, Georgi" "Apple Disambiguation revisited" "uHi, I'm looking into disambiguation data again, see my previous contact about this : www.mail-archive.com/ /msg01456.html As understand it, the extraction framework has now started to include xyz_(disambiguation) pages rather than include the disambiguates relationship directly in any particular resource page. Looking at my favorite example 'Apple_(disambiguation)', in the 3.5 release we see : dbpedia.org/describe/?url= Apple Inc and Apple Records (which were present in the 3.4 release). Looking at the file sizes, I think the 3.5 disambiguations file size is nearly half that of the 3.4 release, have there been significant changes again or is this just a bug ? Cheers, Rob DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" Hi, I'm looking into disambiguation data again, see my previous contact about this : www.mail-archive.com/ /msg01456.html As understand it, the extraction framework has now started to include xyz_(disambiguation) pages rather than include the disambiguates relationship directly in any particular resource page. Looking at my favorite example 'Apple_(disambiguation)', in the 3.5 release we see : dbpedia.org/describe/?url= Apple Inc and Apple Records (which were present in the 3.4 release). Looking at the file sizes, I think the 3.5 disambiguations file size is nearly half that of the 3.4 release, have there been significant changes again or is this just a bug ? Cheers, Rob" "DBpedia ontology - predicate constraints" "uDear DBpedia creators, I am wondering about two issues connected with the properties extracted from the infoboxes and mapped to DBpedia ontology predicates. The first one is the absence in some of the predicates' definitions domains and ranges - e.g. [1], [2]. 
Shall I assume that I should consult schema.org (which doesn't have the full definitions BTW) in order to check the requirements? The second observation is connected with the fact, that many of the extracted properties are not compatible with the constraints. E.g. [3] Mens (band) has a hometown property [4], which according to its definition [5] has a domain restricted to People. Are there any ongoing efforts to fix these issues? How could I help with them (e.g. by providing the appropriate domain/range definitions)? Kind regards, Aleksander Pohl [1] [2] [3] [4] [5] hometown uHi Aleksander, On 3/19/14, 7:21 PM, wrote: I think this is mainly due to the crowdsourcing nature of the ontology. Contributors may neglect domain/range constraints while adding a new ontology item. Only if there is an equivalence statement between DBpedia and schema.org AND schema.org defines domain/range constraints. I think @Heiko can give you more details, cf. his paper [1]. Cheers! [1] > uHello Aleksander, The wiki is community driven and any action to fix such conflicts is more than welcome. If number of required changes is small we can directly fix it in the wiki otherwise we can discuss ways to automate this. Best, Dimitris Hi Aleksander, On 3/19/14, 7:21 PM, wrote: I think this is mainly due to the crowdsourcing nature of the ontology. Contributors may neglect domain/range constraints while adding a new ontology item. Only if there is an equivalence statement between DBpedia and schema.org AND schema.org defines domain/range constraints. I think @Heiko can give you more details, cf. his paper [1]. Cheers! [1] > uOn Mar 19, 2014 7:24 PM, \" \" < > wrote: from the infoboxes and mapped to DBpedia ontology predicates. domains and ranges - e.g. [1], [2]. Shall I assume that I should consult schema.org (which doesn't have the full definitions BTW) in order to check the requirements? extracted properties are not compatible with the constraints. E.g. [3] Mens (band) has a hometown property [4], which according to its definition [5] has a domain restricted to People. them (e.g. by providing the appropriate domain/range definitions)? Are you aware of the DBpedia mappings wiki? Maybe that's a dumb question. Anyway, once you have a user account, you can add the domain/range here: and fix the domain (maybe change it from Person to its superclass Agent?) here: JC uHi, In my opinion, the main flaw - indeed, I agree completely with Jona - is that no domain is defined for a huge number of properties. So, let us all add a domain definition for a property whenever we see one missing! Regards, Gerard Van: Jona Christopher Sahnwaldt [ ] Verzonden: vrijdag 21 maart 2014 0:11 Aan: CC: dbpedia-discussion Onderwerp: Re: [Dbpedia-discussion] DBpedia ontology - predicate constraints On Mar 19, 2014 7:24 PM, \" \" < > wrote: Are you aware of the DBpedia mappings wiki? Maybe that's a dumb question. Anyway, once you have a user account, you can add the domain/range here: and fix the domain (maybe change it from Person to its superclass Agent?) here: JC uI disagree that this has been only negligence. There are people that believe that constraining the schema is a good thing, and there are people that believe that reusing more freely properties across multiple classes is a good thing. It's a schema driven versus data driven approach. But since the latter group can just ignore the constraints, I don't see harm in doing it. On Mar 21, 2014 1:40 AM, \"Kuys, Gerard\" < > wrote: uOn 24 March 2014 22:57, < > wrote: That's right. 
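The Mens (band) example above suggests a quick way to see how widespread such domain mismatches are: list subjects that carry dbo:hometown but are not typed dbo:Person. This is only a sketch against the public endpoint; whether a hit means bad data or a domain that should simply be widened to Agent, as suggested earlier in the thread, is exactly the judgment call being debated here.

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s ?type WHERE {
  ?s dbo:hometown ?o .
  # subjects using the property outside its declared domain
  FILTER NOT EXISTS { ?s rdf:type dbo:Person }
  OPTIONAL { ?s rdf:type ?type }
}
LIMIT 100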
We could fix that with a bit of effort. When we build the (in-memory) ontology properties from the mappings wiki, we use the default value owl:Thing if there's no domain [1] or range [2]. That's necessary because domain and range must not be empty. But it also means we don't distinguish between missing values and explicit owl:Thing values. That's the root of the problem. When we write the OWL file, we omit the domain [3] or range [4] if its (internal) value is owl:Thing. Of course, we could write owl:Thing, but that wouldn't solve your problem. I guess we could fix this by adding flags to the OntologyProperty class [5] to remember if there was a domain / range in the mappings wiki source. Not a big deal, but not very elegant either. Two additional fields that are only needed for writing the ontology I said domain and range must not be empty, but that's not the whole story. There are explicit checks if these values are set [6][7], but we allow them to be empty for some namespaces, e.g. schema.org, just not for DBpedia properties. Maybe we could drop that check and allow empty values for domain and range in DBpedia properties. (That would allow us to disinguish between missing values and explicit owl:Thing values and thus would be an alternative fix.) But I don't know what would happen if we allowed that. Maybe everything would be fine, maybe something would crash somewhere. I don't know. JC [1] [2] [3] [4] [5] [6] [7]" "dbpedia infoboxes not in sync" "uHi, I was trying to query some info box data (related to diseases) using DBpedia's SPARQL endpoint. My query is PREFIX dbpedia2: select ?x, ?medline, ?omim, ?icd10, ?diseasedb where { ?x a . ?x ?medline . optional { ?x ?omim .} . optional { ?x ?icd10 .} . optional { ?x dbpedia2:diseasesdb ?diseasedb .} . } While it returns many hits, some of the data seems to be missing, whereas the Wikipedia page has it. For example, for the Rickets entry, the Wikipedia page ( the DBPedia RDF entry for it ( does not list an ICD-10 code. I seem to recall that the DBpedia infobox ontology was the preferred way to query infobox data on DBPedia, but it does not seem to be in sync with what Wikipedia has. Is there a schedule that DBPedia follows to keep in sync with Wikipedia? Or is this just a case of lossy conversion from Wikipedia to DBpedia? uIl 30/12/2010 5.03, Rajarshi Guha ha scritto: Hi, you could try the same query on this endpoint: current Wikipedia. The one you have tried ( Wikipedia dumps dated to March 2010. New dumps should be available soon. Happy New Year, roberto" "Reminder: Linked Data Meetup London, 24 February" "uHi all, a quick reminder for the upcoming Linked Data Meetup in London on 24 February. This meetup is a full-day event with talks and presentations around the latest developments in the Linked Data community, a panel discussion on how we can put Linked Data to work, and workshops on SPARQL, building Linked Data applications, and using Drupal with RDFa. Also we'd like to invite you to host working groups in the afternoon, please get in touch with us. Please find the programme on the meetup page above. Cheers, Georgi" "Inconsistent result sets" "uHi, I have an issue with discrepancies found when querying DBPedia in different ways. For example, select ?s, ?p, ?p1, ?o where { { ?p ?o } UNION { ?s ?p1 } } at dbpedia.org/sparql returns a large number of “sameAs” results, nothing else. The same query at live.dbpedia.org returns nothing. Yet the URL Where are the missing triples at the SPARQL endpoints? 
thanks, Csaba uTake a look at this thread that contains some info for the reason behind this:  On Monday, July 14, 2014 7:48:55 AM, Csaba Veres < > wrote: uOn 7/14/14 10:30 AM, Csaba Veres wrote: The following endpoints provide you with access to DBpedia datasets, in varying degrees i.e., they don't load identical datasets and their servers are configured differently, due to the fact they are hosted on different machines uThanks for the links about problems with dbpedia live. But why does the regular dbpedia.org/sparql endpoint not return the triples you see with the Best, Csaba u0€ *†H†÷  €0€1 0 + uHi Csaba, it might be an issue with encoding in SPARQL. SELECT ?p ?o WHERE { ?p ?o} is different from SELECT ?p ?o WHERE { ?p ?o} However, I am a bit puzzled that both resources happily co-exist. Best, Heiko Am 14.07.2014 16:30, schrieb Csaba Veres: uOn Mon, Jul 28, 2014 at 4:39 PM, Heiko Paulheim < > wrote: Looks like you found a bug :) Last year we created the sameAs links with a separate process and looks like the English resources were not URI encoded English is the only DBpedia language that still uses URIs instead of IRIs (for compatibility with existing systems) but I think we should change that in the next release" "dump encoding" "uWho can confirm dump text files are 7 bits ascii encoded ? I guess \"linéaire\" is coded \"lin\u00E9aire\" in string and coded \"lin%C3%A9aire\" in uri. Ok for uri coding, it is a standard. but I found nothing for \uxxxx encoding, I also found that \Uxxxxxx is possible. Any more informations ? Best regards Luc Peuvrier \"Alg\u00E8bre lin\u00E9aire\"@fr . DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" Who can confirm dump text files are 7 bits ascii encoded ? I guess \"linéaire\" is coded \" lin\u00E9aire \" in string and coded \" lin%C3%A9aire\" in uri. Ok for uri coding, it is a standard. but I found nothing for \uxxxx encoding, I also found that \Uxxxxxx is possible. Any more informations ? Best regards Luc Peuvrier < < lin\u00E9aire\"@fr < . uHi Luc, Check section 2 & 5 of the spec Cheers, Dimitris Στις 08 Αυγ 2011 3:05 π.μ., ο χρήστης \"luc peuvrier at home\" < > έγραψε: \"lin%C3%A9aire\" in uri. possible. ." "GeoData, specifically: dbpprop:coorDmsProperty" "uHi, I've been looking at dbpedia data for buildings in London, and am wondering how I should interpret the following geodata triples. Specifically, i'm looking at the following file: Presumably, to grab geolocation data, i'd look at the predicate dbpprop:coorDmsProperty. However, the co-ordinates don't appear to be usable, as they are asserted as individual triples, e.g.: dbpprop:coorDmsProperty * N (en) * W (en) * 2 (xsd:integer) * 12 (xsd:integer) * 38 (xsd:integer) * 51 (xsd:integer) * 53 (xsd:integer) * 77 (xsd:integer) Am I missing something here? Thanks, Dan uHi Daniel, forget about the coorDmsProperty, it's a useless output of our generic infobox extraction. Most locations should have a geo:lat [1] and geo:long [2] property, those are the appropriate ones to use. Well, most should(Unfortunately, Temple_Lodges_Abney_Park doesn't have them) Cheers, Georgi [1] [2] wgs84_pos#long" "Federated queries in various DBpedia endpoints" "uHi everyone, I've been trying to run some federated queries with the SERVICE clause on the international endpoint and some local ones (Portuguese, Greek, Spanish). This may be useful to integrate heterogeneous entity descriptions coming from different language chapters. 
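A minimal sketch of the kind of query meant here, combining the main endpoint with one chapter endpoint via SERVICE. The chapter endpoint (es.dbpedia.org) and the resource are only illustrative, and the pattern assumes that owl:sameAs links to the chapter URIs have been loaded on the main endpoint:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?localResource ?localLabel WHERE {
  <http://dbpedia.org/resource/Berlin> owl:sameAs ?localResource .
  FILTER (STRSTARTS(STR(?localResource), "http://es.dbpedia.org/resource/"))
  SERVICE <http://es.dbpedia.org/sparql> {
    ?localResource rdfs:label ?localLabel .
  }
}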
Due to a known issue in Virtuoso [1] (apparently not resolved yet), I found that none of them have those types of queries enabled; they all result in the following error: Virtuoso 42000 Error SQ200: Must have select privileges on view DB.DBA.SPARQL_SINV_2 Except for the Portuguese endpoint, which yields a syntax error: 37000 Error SP030: SPARQL compiler, line 3: syntax error at 'SERVICE' The solution in [1] works fine; the following instructions in the ISQL command line are needed: grant select on "DB.DBA.SPARQL_SINV_2" to "SPARQL"; grant execute on "DB.DBA.SPARQL_SINV_IMP" to "SPARQL"; Hope this helps. Cheers! [1] virtuoso-federated-query uHi Marco, On 16-May-13 4:26 PM, Marco Fossati wrote: ATM we have disabled SPARQL-FED on DBpedia. You can try from the URIBurner sparql endpoint for instance uHi Rumi, On 5/17/13 12:18 PM, Rumi wrote: What is the reason for this action? Note however the endpoint in" "inclusion database with dbpedia" "uDear DBpedia team, The Miroslav Institute of Lexicography is an institution publishing an encyclopedia in the Croatian language. Some of them are available on the official website. I'm interested in including the database of the Miroslav Institute of Lexicography in the DBpedia project. So, is there a possibility of linking articles from their official website with the DBpedia project (ontology), as Freebase did? I guess it would have to be done through the owl:sameAs link, but with what tools? I would like to know what procedure to follow and which tools to use. Thank you, Ivana uHi Ivana, On 01/16/2013 11:15 PM, Ivana Sarić wrote: You can find more information about the interlinking between DBpedia and the other datasets here [1]. [1] Interlinking" "easy mistakes: using /page URIs in sparql queries" "uSeems a few of us habitually make the same mistakes, so I thought I'd report it here. Not sure exactly what UI improvements to suggest Dan 17:59 melvster: question: anyone know how do i sparql the genus of a peanut ( seemed to have made a mistake 18:01 abernier has joined (n= ) 18:02 shellac: what are you using as the subject? 18:03 shellac: ensure it's 18:03 danbri: select ?x where { ?x } 18:03 danbri: works for me in 18:03 melvster: danbri: thanks! 18:03 danbri: welcome 18:04 danbri: just out of curiosity, what did you try? 18:04 danbri like shellac was guessing on a page vs resource mixup 18:04 Wikier has left IRC (Remote closed the connection) 18:05 danbri: .g "may contain Arachis" 18:05 phenny: danbri: 18:05 shellac: it's an easy mistake to make (i.e. I've done it :-) 18:05 danbri: :) 18:06 shellac: not sure that redirect is a good idea, although I'm not sure what I'd suggest in its place 18:07 abernier: hi 18:08 shellac has left IRC ("Ex-Chat") 18:09 melvster: sorry yes, i had page 18:09 danbri: logger, pointer?
18:09 logger: See 2010-01-20#T17-09-40 uOn 20 Jan 2010, at 17:11, Dan Brickley wrote: Perhaps we need a convention for an icon that you can put on the HTML page and that indicates: “To get the identifier of the thing the page talks about, copy-paste the link behind this icon”. Sort of a “Linked Data Identifier” icon. Similar in idea to the RSS/ Atom feed icon. Richard" "limit and offset switched at lod endpint?" "uHi all, I encountered a strange issue when querying the LOD SPARQL endpoint ( With an offset > 0 it seems that both parameters are switched or something like that. This is an example query showing that behaviour: SELECT ?instance_label FROM WHERE { ?instance rdfs:label ?instance_label . ?instance rdf:type dbpedia-owl:Colour . } ORDER BY ?instance_label LIMIT 1 OFFSET 5 Instead of one result with an offset of five you get 5 results with an offset of one. Switching the values of limit and offset returns the answer you'd expect from the query above. The same query gives a correct result at the dbpedia.org endpoint. Any ideas what's causing this? Regards, Sören uHi, Sören" "Guidelines for Mapping (Questions arised from the mapping marathon)" "uHi Mariano, I don't have answers for everything, but here goes my 2c. (split by subject) MAPPING GUIDE is there any policy for creating DBpedia classes or properties?. For The only guidelines I know are specified here: How do we delete an erroneous mapping? Using the delete tab on the wiki (delete page) If we consider that it is necessary a given property (e.g., debutDate) in See if there is a duplicate. That might be the reason for the deletion. You should then use the one that remained. Otherwise, discuss in the list and the discussion page of that property. Eg: In the statistics of (es) Ficha_de_futbolista we can find the property Yes. We tried creating two mappings, one for \"altura\" and another for Exactly. What do you mean by inconsistency? Why is it a problem? Can we map an infobox to 2 DBpedia classes if both classes are equivalent? We should not have both classes. That is a bug in the ontology and should be fixed. Some properties seem to exist in DBpedia, but when we use them in the This is probably a confusion between infobox property and DBpedia property. The dbprop ( \"infobox property\", while the one that contains the DBpedia properties. You can only map infobox properties to DBpedia Ontology properties. I have a february version of a document entitled \"DBpedia mapping I' m also not sure, but there seems to be a new one. You can check directly in the repository. Folks, anybody else can chip in? Cheers, Pablo uHi, On 11/08/2011 12:16 PM, Pablo Mendes wrote: > Hi Mariano, > I don't have answers for everything, but here goes my 2c. > >> is there any policy for creating DBpedia classes or properties?. For example, we missed the class BullFighter, we checked there was no other similar class, and we created it. > > The only guidelines I know are specified here: > > > How do we delete an erroneous mapping? All wiki pages have a delete tab, but we do not know if it is an immediate delete or it will be checked by any admin > > > AFAIK, it's immediate. In DBpedia-Live, we reprocess all the changed pages we get from Wikipedia update stream, and we also reprocess the pages that are affected by a mapping change. The pages we get from Wikipedia update stream have higher priority, so they are reprocessed first. 
So, the pages affected by a mapping may take a few minutes to get reprocessed depending on how many live page are waiting for reprocessing, but it will not take long to appear. > > When we create a DBpedia class or property, when it becomes effective?, what is the life cycle of the modifications? > > > AFAIK, it's immediate. What do you mean \"life cycle\"? Changes show up in live.dbpedia.org nearly immediate and on dbpedia.org in the next release (usually twice a year for the entire data & as frequent as you want for your localized version. Same as the previous issue, it may take a few minutes to appear in DBpedia-Live. > > > If we consider that it is necessary a given property (e.g., debutDate) in the DBpedia ontology, but that property was deleted (we can see this in the page history), what do we have to do?. > > > See if there is a duplicate. That might be the reason for the deletion. You should then use the one that remained. Otherwise, discuss in the list and the discussion page of that property. > > is there any way for knowing which username created more mappings? > > > Yes. We do that for the DBpedia Portuguese. See pt.dbpedia.org. I'm glad to share the code. > > It seems that the extraction process reads the properties found in the infobox instances, without checking if those properties are in the infobox definition. is that so? > > > I think so. > > Eg: In the statistics of (es) Ficha_de_futbolista we can find the property \"altura\" as one of the most used, but that property is not in the infobox definition. In the infobox definition we can see \"estatura\" (a concept similar to \"altura\") but is much less used that \"altura\". Do we have a mechanism to map both infobox properties to the same DBpedia property? > > > Yes. > > We tried creating two mappings, one for \"altura\" and another for \"estatura\", > > > Exactly. > > but we get always two triples for each infobox instance (although the instance has only one of these properties). Any solution? > > > What happens if you map only one? Maybe the infobox itself is doing some resolution there? When you say you get two triples, do you mean you get one for the mapped and one for the non-mapped property. > > The parsing of spanish dates (dd/mm/yyyy) does not work (property mapped to xsd:date). Do we have the same problem for decimal numbers? (in spanish, decimal numbers use to be like 2,5 instead of 2.5). > > > You can patch the Date and Decimal extractors to take some i18n config params. I guess that is not too much effort in the parser. > > > Some wikipedia pages have infobox instances with properties that are not in the infobox definition. May be those properties have been deleted from the definition, producing an inconsistency (e.g. (es) Partidos and Ficha_de_montaña). Any recommendation? > > > What do you mean by inconsistency? Why is it a problem? > > What is the meaning of the grey rows in the statistics page? It says \"template is on the ignorelist\". What is this, a \"deprecated\" property/class? > > > The answer is here: > \"the statistics contain non relevant templates like Unreferenced or Rail line. These templates aren't classical infoboxes and shouldn't affect the statistics. On that account they can be ignored. If a template is on the ignore list, it does not count for the number of potential infoboxes.\" > > Can we map an infobox to 2 DBpedia classes if both classes are equivalent? E.g.: Organization and Organisation classes exist in DBpedia. > > > We should not have both classes. 
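A quick way to check whether a suspected duplicate class like this actually carries instances is to count both URIs side by side on the public endpoint. This is only a sketch and assumes a SPARQL 1.1 endpoint (for VALUES); a class that is not used in the data simply returns no row:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?class (COUNT(?instance) AS ?instances) WHERE {
  VALUES ?class { <http://dbpedia.org/ontology/Organisation>
                  <http://dbpedia.org/ontology/Organization> }
  ?instance rdf:type ?class .
}
GROUP BY ?class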
That is a bug in the ontology and should be fixed. I had a look on the ontology wiki and found only \"Organisation\" class and did not find the other one you mentioned. > > > In the statistics page (e.g spanish at about the spanish infoboxes sorted by instance number. In the case of spanish, it says there are 1311 different infoboxes, but the table shows only ~300. Where can be find the rest?. The number of properties shown in statistics have a similar issue. For example, in the definition of infobox (es) Ficha_de_futbolista there are 20 properties, but in the infobox statistics (information about the spanish infoboxes sorted by instance number. In the case of spanish, it says there are 1311 different infoboxes, but the table shows only ~300. Where can be find the rest?. The number of properties shown in statistics have a similar issue. For example, in the definition of infobox (es) Ficha_de_futbolista there are 20 properties, but in the infobox statistics ( there are 22. These 2 additional properties come from the infobox instances? > > > I don't know the answer. Paul Kreis is possibly the only one that would know. > > Some properties seem to exist in DBpedia, but when we use them in the mappings are considered nonexistent (are rendered in red). E.g: in (es) Ficha_de_Tenista we tried to use the DBpedia property \"turnedpro\" (in theory existing, as can be seen at at property in our mapping we get \"When we try to use that property in our mapping we get \"Couldn't load property mapping on page en:Mapping es:Ficha de tenista. Details: Ontology property turnedpro not found\". As well we tried with dbpprop:turnedpro, getting the same result. > > > This is probably a confusion between infobox property and DBpedia property. The dbprop (http://dbpedia.org/property) namespace should be read as \"infobox property\", while the http://dbpedia.org/ontology namespace is the one that contains the DBpedia properties. You can only map infobox properties to DBpedia Ontology properties. > > is there any scheduling for the next dump? We are anxious about knowing how many spanish triples we are going to get. > > > Generalized dumps for the entire (Internationalized) DBpedia usually happen twice a year. The international chapters are free to release their data in any release cycle they see fit. So you may just run the extraction framework on your side and tell us how many triples you get. We are also curious! :) > > I have a february version of a document entitled \"DBpedia mapping language\", do you have an actualized version? I found some typos and it does not cover conditional mappings. > > > I also don't know the answer to that question. You can check directly in the repository. http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/cefae9797133/core/doc/mapping_language > > I have a \"big machine\" for hosting the spanish DBpedia, and I hope to set up the extraction process on that machine very soon. Once we get a good spanish extraction process, what do we have to do in order to get the es.dbpedia.org redirect? > > > Whenever the machine is set up, please e-mail dbpedia-developers with the IP and the responsible party will set up the domain forwarding. > > Concerning internationalized resource URIs, we see that the spanish triples generated now in DBpedia have the URI form http://dbpedia.org/Resource/Whatever. 
Therefore, if we query about the resource http://dbpedia.org/Resource/Berlin, we will get a unique resource with all the properties specified by 15 internationalized versions of wikipedia. Right? However, the \"hosted\" versions of DBpedia (ge, el, ru) have a URI like http://ge.dbpedia.org/Resource/Berlin. Right? > > > There is a current debate about this in the i18n committee. The current solution is to generate the triples under http://es.dbpedia.org/resource/Berlin, and set sameAs links to http://dbpedia.org/resource/Berlin. My preferred solution would be to bypass this step at least in cases where we're more confident that the link is true (for example with bidirectional language links). Feel free to join the discussion: > http://sourceforge.net/mailarchive/forum.php?thread_name=BANLkTin1a9tHUvQb%2B1sMsfuzr8fgUgyQ_Q%40mail.gmail.com&forum;_name=dbpedia-developers > > Folks, anybody else can chip in? > > Cheers, > Pablo >" "Invalid XHTML+RDFa in DBpedia?" "uHi, it looks like the XHTML+RDFa output of DBpedia is not valid, mostly because of unescaped ampersand (&) characters, which causes problems for validating clients: Is this already known and are there any plans to fix this? Best regards Bernhard uHi Bernhard, Thanks for reporting this. We are looking into this issue and shall report back with our findings/fix Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 17 Jan 2010, at 12:46, Bernhard Schandl wrote:" "Alternate language versions of data" "uThis e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\" Alternate language versions of data uHi Tom, the URIs of DBpedia resources are generated from English Wikipedia article names - resource URI for thing, only in different languages, so the data is attached to the same URI. Thus, the DBpedia resource URI for is Or rather, it should beThe current DBpedia data is based on Wikipedia dumps from late September 2009 (the exact dates are given on link between and was only established on October 1: http://en.wikipedia.org/w/index.php?title=Short-beaked_Common_Dolphin&diff;=317238869&oldid;=312847297 http://fr.wikipedia.org/w/index.php?title=Dauphin_commun_%C3%A0_bec_court&diff;=45334637&oldid;=44442487 That's why in this case, the only non-English data is in Catalan. In most other cases, the RDF store at dbpedia.org contains data extracted from the top 20 Wikipedia editions. Regards, Christopher On Tue, Jan 5, 2010 at 12:28, Tom Maslen < > wrote: uChristopher Sahnwaldt wrote: uChristopher, Tom, all uHello, Maybe I don't understand how the SPARQL end points work correctly, but the dbpedia-live.openlinksw.com end point has the same number of languages for the resource as the dbpedia.org end point Is it even possible to get all of the alternative languages for this resource using SPARQL? Thanks, /t uTom Maslen wrote: Yes, as long as the datasets have been loaded. 
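For reference, a query of the following shape lists every label that has been loaded for a resource, together with its language tag. It is only a sketch: the resource URI is written out from the article discussed above and is illustrative, and the projected lang() expression assumes a SPARQL 1.1 endpoint (on an older endpoint, select ?label alone and read the tags from the results):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label (lang(?label) AS ?language) WHERE {
  <http://dbpedia.org/resource/Short-beaked_Common_Dolphin> rdfs:label ?label .
}

Adding FILTER (lang(?label) = "fr") restricts the result to a single language.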
fct" "Paper on using NLP techniques to populate Wikipedia infoboxes with information from article texts" "uHi, I was just scanning the WWW2008 scientific program, which includes a paper that should be relevant to DBpedia: Automatically Refining the Wikipedia Ontology Fei Wu, Daniel S. Weld In the 17th International World Wide Web Conference, (WWW-08), Beijing, China, April, 2008. The author also published a paper about using NLP techniques to automatically populate infoboxes with in information from artile texts. See: Autonomously Semantifying Wikipedia Fei Wu, Daniel S. Weld In the Sixteenth Conference on Information and Knowledge Management (CIKM-07), Lisbon, Portugal, November, 2007. (Best paper prize) Quite interesting :-) Chris uHi, Chris Bizer wrote: btw, the program mis-titled this paper. Under its correct title you can also now find the PDF: Fei Wu, Daniel S. Weld, 2008. Automatically Refining the Wikipedia Infobox Ontology, in the 17th International World Wide Web Conference, (WWW-08), Beijing, China, April, 2008. See Their Kylin Ontology Generator (KOG) has some similarities to YAGO in that WordNet provides the concept graph. U of Washington also has an interesting initiative on Intelligence in Wikipedia. See Mike" "Missing properties (updates) in DBpedia Live" "uHi all, I was just running the query for all object-properties with defined rdfs:domain's and rdfs:range's in DBpedia and DBpedia Live: SELECT * WHERE { ?s rdfs:domain ?domain . ?s rdfs:range ?range . ?s rdf:type owl:ObjectProperty } I noticed that the count of those properties is actually higher in DBpedia ( DBpedia: 431 DBpedia Live: 395 Some of the missing properties are listed below: All of those properties are available at Also, the domain and range specified at http://mappings.dbpedia.org/index.php/OntologyProperty:Ceo is Organisation and Person respectively. But the query: SELECT * WHERE { ?p ?o } @ DBpedia Live still shows SoccerClub and Person. Shouldn't those mappings \"immediately\" be reflected in DBpedia Live? Kind regards, Daniel Hi all, I was just running the query for all object-properties with defined rdfs:domain's and rdfs:range's in DBpedia and DBpedia Live: SELECT * WHERE { ?s rdfs:domain ?domain . ?s rdfs:range ?range . ?s rdf:type owl:ObjectProperty } I noticed that the count of those properties is actually higher in DBpedia ( http://dbpedia.org/sparql ) than in DBpedia Live ( http://live.dbpedia.org/sparql ): DBpedia: 431 DBpedia Live: 395 Some of the missing properties are listed below: http://dbpedia.org/ontology/bandMember http://mappings.dbpedia.org/index.php/OntologyProperty:BandMember http://dbpedia.org/ontology/broadcastNetwork http://mappings.dbpedia.org/index.php/OntologyProperty:BroadcastNetwork http://dbpedia.org/ontology/ IsPartOfMilitaryConflict http://mappings.dbpedia.org/index.php/OntologyProperty:IsPartOfMilitaryConflict http://dbpedia.org/ontology/ isPartOfWineRegion http://mappings.dbpedia.org/index.php/OntologyProperty:IsPartOfWineRegion http://dbpedia.org/ontology/ secretaryGeneral http://mappings.dbpedia.org/index.php/OntologyProperty:SecretaryGeneral http://dbpedia.org/ontology/ senator http://mappings.dbpedia.org/index.php/OntologyProperty:Senator All of those properties are available at http://mappings.dbpedia.org/ , so why are they missing in DBpedia Live? Also, the domain and range specified at http://mappings.dbpedia.org/index.php/OntologyProperty:Ceo is Organisation and Person respectively. 
But the query: SELECT * WHERE { < http://dbpedia.org/ontology/ceo > ?p ?o } @ DBpedia Live still shows SoccerClub and Person. Shouldn't those mappings \"immediately\" be reflected in DBpedia Live? Kind regards, Daniel uHi, On 10/23/2011 11:43 AM, Gerber Daniel wrote: The problem is fixed now." "interrogating freebase and dbpedia from the same query" "uHello everyone. I wonder whether I can run a sparql query which involves freebase, from dbpedia sparql endpoint, using linked data. For instance, may I retrieve some information about something from dbpedia and other informations about the same thing from freebase, on the same query from the dbpedia sparql end-point? Thank you for your answers. Gratis per te Avatar per Messenger e sfondi per il PC landing2.aspx?culture=it-it ucassio steel wrote: Well, there's one question here: \"do you want the right answers?\" I suppose that somebody could extract RDF statements out of freebase, load them into the same graph as dbpedia, and then use the published dbpedia <-> freebase owl:sameAs statements to map them together. If you did that, AND IF YOU CHECKED THE VALIDITY OF YOUR ANSWERS (which is NOT standard practice in semweb research, so far as I can tell) you'd note that the mappings are horribly wrong. The dbpedia <-> freebase mappings are so bad that they'll substantially change the semantics of dbpedia, even if you never load anything from Freebase; for instance, you can easily infer bizzare and incorrect statements such as dbpedia:Area_11 owl:sameAs dbpedia:Japan. it just isn't so, and results in the inference of millions of extra bogus triples, for instance, dbpedia:Japan rdfs:Label \"Area 11\"@en . I found this out when I was trying to establish a 1-1 relationship between some set of entities and iso codes for countries and second-level administrative division (needed that because 90% of the \"Countries\" in the dbpedia ontology are fictional or not going concerns, i.e. \"Austria-Hungary\", \"The Republic Of Atlantis\", etc.) The ISO codes are in freebase, but the dbpedia <-> freebase mappings were so bad that I decided to give up on Dbpedia as a \"primary reference\", use Freebase and just cherry pick selected information items out of it. To be fair, I do have freebase <-> dbpedia mappings that are much better quality than the ones on the dbpedia site and the time might be right to publish them. The issue here is that there are two ways you can construct said mapping: (i) when fb items are created from wikipedia extraction, the wikipedia page id is recorded in freebase; these can be looked up against the \"page Id\" file from dbpedia. (ii) freebase contains a set of \"keys\" that name items in freebase; some of these keys are in the wikipedia namespace, and those can be mapped to dbpedia pretty easily. The published mappings look a lot like (ii); the keys are really promiscuous and link up things that are at most circumstantially related uCassio, The short answer to your question (as I understood it) is that you could not issue such a query to the dbpedia sparql endpoint by itself. Somehow you would need to get access to an endpoint that contains both the freebase data as RDF and the mappings that Paul discusses here in order to run your query. Please correct me if I am wrong! -Ben uBenjamin Good wrote: Ok, actually it is a bit more complex thant this. 
Both my and the mapping files contain fbase URIs like If you go to ~that~ URL you get redirected to Now, if you follow a few links you'll eventually find which contains the facts that you probably want about this subject in NT format. So far as I know, however, fbase doesn't offer a dump file with all of the NT assertions about fbase, so anybody who wants to fill an RDF store/SPARQL query system with fbase assertions really has two choices: (i) run a crawler against fbase to harvest said assertions, (ii) or derive these assertions from the \"link export\" file that can be found here I think (ii) would be a straightforward project, particularly if you used GUIDs on the LHS of your assertions; there's some talk about this here: Practically there are all sorts of funky details, such as expanding namespaced keys into blank nodes, _:AyiJNCGP1602 _:AyiJNCGP1602 . _:AyiJNCGP1602 \"4267124\" . Having messed with the \"link export\" file quite a bit I'd say the main issue is that it's an awfully big file and the scripts I run against take a frustratingly long time For a long time I've taken the route of \"efficient data structures & algorithms\" but if I had do a lot more of this I'd be looking at parallelization. Although @fbase is officially trying to submerge GUIDs, both the \"simple topic dump\" and \"link export file\" are highly dependent on GUIDs and as of the last dump files, the \"mid\" identifies that @fbase wants us to use are nowhere to be found. uOn Mon, Jun 21, 2010 at 2:53 PM, Paul Houle < > wrote: A more direct way to do this is just ask for RDF instead of HTML with your first request: curl -L -H 'Accept: application/rdf+xml' which will redirect you to Beware though that there's currently a bug in Freebase which truncates the results to the first 100 triples. For the vast majority of the topics this isn't an issue, but if you receive exactly 100 triples, you should probably assume that you don't have them all. Tom" "how to connect input to wikipedia extraction frame work?" "uHi All, now, i try to operate Wikipedia extraction frame work and, i know that the input to the extraction code is dumped articles from Wikipedia. please i want to ask two questions: first, this dumped articles will be in the form   Database backup dumps or Static HTML dumps? second, how to connect this input data to the code. thanks for your time kind regards, amira Ibrahim abd el-atey uHello, amira Ibrahim abd el-atey schrieb: Database dumps. This page may answer some of your questions: In essence, you need to import the Wikipedia dumps, checkout the DBpedia SVN and run the appropriate extract_*.php file. (It depends on what you want to achieve.) Kind regards, Jens" "DBpedia Lookup: restricting by class?" "uHello, I'm trying to figure out how to restrict results by class when using DBpedia Lookup. The following queries: both return \"USSR State Prize\" as the second result, which is not a Person. Ryan uHey Ryan, class filtering doesn't work. I did put it into the API as parameter, but haven't implemented it yet (although no big deal). Will try to get this done along with some other tweaks over the weekend. Sorry for the inconvenience, I should have added a comment (hmm, to the not yet existing documentation) Cheers, Georgi" "Downloads page languages" "uHello! By which principle are languages chosen for Downloads page at dbpedia.org? I was surprised not to find russian datasets there. It is not very important as datasets for all languages are available at downloads.dbpedia.orgso I'm just curious. 
Regards, Alexander Hello! By which principle are languages chosen for Downloads page at dbpedia.org ? I was surprised not to find russian datasets there. It is not very important as datasets for all languages are available at downloads.dbpedia.orgso I'm just curious. Regards, Alexander uHi, On Sun, Jun 26, 2011 at 17:31, Alexander Sidorov < > wrote: To list exactly twelve languages under Other Datasets is an arbitrary convention that keeps the table readable. The table lists all languages for which infobox mappings exist (en, de, hu, sl, hr, el). The remaining spots are filled by the languages whose Wikipedias have the most articles [1]. By the time of the extraction, the Dutch Wikipedia was larger than the Russian one. Russian was in fact the first language to drop out of the table. Sorry ;) Cheers, Max [1] wikipedias_html.php" "Virtuoso Open-Source Edition, Version 6.1.1 release" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.1.1: New product features as of March 30, 2010, V6.1.1, include: * Database engine - Added wizard-based generation of SQL Tables from CSV imports - Added wizard-based publishing of RDF based Linked Data from CSV files - Added FOAF+SSL login for SQL clients - Added OPTIONS for HTTP server - Added support for setMaxRows in JDBC driver - Added support for JDBC hibernate - Added support for unzip_file () - Added swap guard option - Fixed deadlock retry - Fixed memory leaks - Fixed mtx checks for checkpoint and log write - Fixed X509ClientVerify flag of 0/1/2/3 to accept self-signed or optional certificates - Fixed several issues with JDBC XA support - Fixed use sk_X509_ALGOR_* macros to support OpenSSL 1.0.0 - Fixed wide character when getting procedure columns information. - Fixed remove id from hash before free structure - Fixed IN pred as iterator before index path - Fixed missing initialization in calculation of cost and cardinality - Fixed SQL codegen for NOT() retval expression - Updated documentation * SPARQL and RDF - Added OData cartridge for producing RDF-based Linked Data from OData resource collections - Added CSV cartridge for producing and deploying RDF-based Linked Data from CSV resource types - Added uStream cartridge - Added slidesix cartridge - Added optimization of sprintf_inverse(const) - Added improved version of xsl:for-each-row for both SPARQL and SQL - Added DefaultServiceMap and DefaultServiceStorage - Added immortal IRI for uname_virtrdf_ns_uri_DefaultServiceStorage - Added proper ASK support in web service endpoint - Fixed SPARQL 1.1 compatibility in result set syntax - Fixed incorrect codegen of formatter in ssg_select_known_graphs_codegen - Fixed do not encode default graph - Fixed check if datadump is gz - Fixed detection of n3 and nt formats - Fixed regex to remove default ns from XML - Fixed run microformats independent of rdfa - Fixed bug with UTF-8 encoded strings in box - Fixed allow chunked content to be read as strses - Fixed SERVICE parameter passing for basic Federated SPARQL (SPARQL-FED) - Fixed (!ask()) in filters - Fixed codegen for FILTER (?local = IRI(?:global)) . 
- Fixed codegen in LIMIT ?:global-variable and OFFSET ?:global-variable - Fixed support for positional and named parameters from exec() or similar in SPARQL, as if they where global variables of other sorts - Fixed rewriting of group patterns with filters replaced with restrictions on equivs - Fixed faster loading of inference sets from single and graph groups - Upgraded native data providers for Jena to version 2.6.2 - Upgraded native data providers for Sesame to version 2.3.1 - Added support for Sesame 2 HTTP repository interface - Added implemented Sesame's Inference Context interfaces (for backward chained reasoning). * ODS Applications - Added profile page improvements covering Favorite Things, GoodRelations-based Offerings (via \"Seeks\" and \"Offers\" UIs) - Added alternative registration and profile management pages (vsp, php, and javascript variants) that work REST-fully with ODS engine - Added X.509 create certificate generation and export to alternative ODS profile management pages (vsp, php, and javascript) - Added a++ option in user's pages - Added updates to Certificate Ontology used by FOAF+SSL - Added support for Google map v3 - Added 'Import' to user pages (vsp, php, etc.) - Fixed Profile Management UI quirks - Fixed SIOC subscriptions - Fixed object properties in favorites - Fixed ontology APIs - Fixed use newer OAT functions - Fixed invitation problem with multiple users - Fixed typo in scovo:dimension - Fixed image preview Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: * Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): * Home Page: Hi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.1.1: New product features as of March 30, 2010, V6.1.1, include: * Database engine - Added wizard-based generation of SQL Tables from CSV imports - Added wizard-based publishing of RDF based Linked Data from CSV files - Added FOAF+SSL login for SQL clients - Added OPTIONS for HTTP server - Added support for setMaxRows in JDBC driver - Added support for JDBC hibernate - Added support for unzip_file () - Added swap guard option - Fixed deadlock retry - Fixed memory leaks - Fixed mtx checks for checkpoint and log write - Fixed X509ClientVerify flag of 0/1/2/3 to accept self-signed or optional certificates - Fixed several issues with JDBC XA support - Fixed use sk_X509_ALGOR_* macros to support OpenSSL 1.0.0 - Fixed wide character when getting procedure columns information. 
- Fixed remove id from hash before free structure - Fixed IN pred as iterator before index path - Fixed missing initialization in calculation of cost and cardinality - Fixed SQL codegen for NOT() retval expression - Updated documentation * SPARQL and RDF - Added OData cartridge for producing RDF-based Linked Data from OData resource collections - Added CSV cartridge for producing and deploying RDF-based Linked Data from CSV resource types - Added uStream cartridge - Added slidesix cartridge - Added optimization of sprintf_inverse(const) - Added improved version of xsl:for-each-row for both SPARQL and SQL - Added DefaultServiceMap and DefaultServiceStorage - Added immortal IRI for uname_virtrdf_ns_uri_DefaultServiceStorage - Added proper ASK support in web service endpoint - Fixed SPARQL 1.1 compatibility in result set syntax - Fixed incorrect codegen of formatter in ssg_select_known_graphs_codegen - Fixed do not encode default graph - Fixed check if datadump is gz - Fixed detection of n3 and nt formats - Fixed regex to remove default ns from XML - Fixed run microformats independent of rdfa - Fixed bug with UTF-8 encoded strings in box - Fixed allow chunked content to be read as strses - Fixed SERVICE parameter passing for basic Federated SPARQL (SPARQL-FED) - Fixed (!ask()) in filters - Fixed codegen for FILTER (?local = IRI(?:global)) . - Fixed codegen in LIMIT ?:global-variable and OFFSET ?:global-variable - Fixed support for positional and named parameters from exec() or similar in SPARQL, as if they where global variables of other sorts - Fixed rewriting of group patterns with filters replaced with restrictions on equivs - Fixed faster loading of inference sets from single and graph groups - Upgraded native data providers for Jena to version 2.6.2 - Upgraded native data providers for Sesame to version 2.3.1 - Added support for Sesame 2 HTTP repository interface - Added implemented Sesame's Inference Context interfaces (for backward chained reasoning). * ODS Applications - Added profile page improvements covering Favorite Things, GoodRelations-based Offerings (via \"Seeks\" and \"Offers\" UIs) - Added alternative registration and profile management pages (vsp, php, and javascript variants) that work REST-fully with ODS engine - Added X.509 create certificate generation and export to alternative ODS profile management pages (vsp, php, and javascript) - Added a++ option in user's pages - Added updates to Certificate Ontology used by FOAF+SSL - Added support for Google map v3 - Added 'Import' to user pages (vsp, php, etc.) - Fixed Profile Management UI quirks - Fixed SIOC subscriptions - Fixed object properties in favorites - Fixed ontology APIs - Fixed use newer OAT functions - Fixed invitation problem with multiple users - Fixed typo in scovo:dimension - Fixed image preview Other links: Virtuoso Open Source Edition: * Home Page: < >" "Fake Conferences CSCI and WORLDCOMP of Hamid Arabnia" "uFake Conferences CSCI and WORLDCOMP of Hamid Arabnia Hamid Arabnia from University of Georgia is well known for his fake WORLDCOMP conferences deceive researchers anymore using his WORLDCOMP. Hamid Arabnia (Guru of Fake Conferences and champion of academic scam) has recently started 2014 International Conference on Computational Science and Computational Intelligence (CSCI'14) researchers further. CSCI'14 is started under the title of “American Council on Science and Education” which is a dummy corporation (does not exist anywhere in the world). 
Hamid Arabnia buried his name in the list of names of other innocent steering and program committee members of CSCI’14 to avoid any special attention. He knows that if his name is given any special attention then researchers immediately know that the conference is fake due to his “track record” with WORLDCOMP. Hamid Arabnia (the money hungry beast) spoiled the reputations and careers of many authors and committee members involved in his infamous WORLDCOMP for more than a decade and he is now ready to do the same using CSCI. Interestingly, CSCI is scheduled to be held at the same venue where WORLDCOMP was held until 2012. Hamid Arabnia claimed that CSCI proceedings will be published by IEEE but no one knows if IEEE really publishes. Many scholars have already sent emails to IEEE protesting for the unethical behavior of Hamid Arabnia and for the new series of his bogus conferences CSCI. CSCI paper submission deadline will be extended multiple times as usual. Do not spoil your resume by submitting your papers in this bogus conference CSCI which will not be held beyond 2014. Sincerely, Many researchers cheated by Hamid Arabnia conferences" "Cluster could not connect to host" "uDear DBpedians, I recently keep running into that error when issuing (even simple) SPARQL queries: Virtuoso 08C01 Error CL: Cluster could not connect to host 2 22202 error 111 Is there a way to prevent that? Best regards, Heiko uOn 3/12/13 11:38 AM, Heiko Paulheim wrote: This is an issue with the instance. It's an inter-cluster latency issue i.e., sort of like the \"any time\" query feature, if message between nodes aren't received within a set timeframe a fault condition arises. We are working on better treatment of this condition re. what's relayed to clients etc This (like the issue Tim had with IF NOT EXISTS) have arisen from a recent Virtuoso upgrade. We are addressing these matters. uHi, I have the following error: HttpException: 500 SPARQL Request Failed Virtuoso 08C01 Error CL: Cluster could not connect to host 3 22203 error 111 It causes by some maintenance and latest updates. I developed an information extraction system with the aid of DBpedia for MSM2013, the deadline is 20 Mar and unfortunately I can't leverage DBpedia web service now.I just want to know if it resolved before 20 Mar? Hi, I have the following error: HttpException: 500 SPARQL Request Failed Virtuoso 08C01 Error CL: Cluster could not connect to host 3 22203 error 111 It causes by some maintenance and latest updates . I developed an information extraction system with the aid of DBpedia for MSM2013 , the deadline is 20 Mar and unfortunately I can't leverage DBpedia web service now. I just want to know if it resolved before 20 Mar?" "Querying DBpedia with Arabic Triples" "uDear All, I am a PhD student working in the domain of Arabic Semantic Web, and I want please to know: What is the role of DBpedia mappings if we want to interact with DBpedia in Arabic? is it the way to querying the DBpedia in a specific language? If yes, is that means that the only mapped templates will be the available knowledge? I ask that question because the Arabic mapping of DBpedia still very little, so are that mappings the only available knowledge in that language in DBpedia? If someone needs to get information from English Wikipedia in an Arabic interface, is that available? If I have an entities and relationships that are represented in Arabic, how can I relate them to the corresponding entities and predicates or properties in DBpedia? 
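One concrete starting point is to ask the main endpoint for a resource's Arabic label and for any owl:sameAs links into an Arabic DBpedia namespace. This is only a sketch: the resource (Cairo) is illustrative, and the ar.dbpedia.org prefix assumes that an Arabic chapter with interlanguage links has actually been extracted and loaded:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?arabicLabel ?arabicResource WHERE {
  <http://dbpedia.org/resource/Cairo> rdfs:label ?arabicLabel .
  FILTER (lang(?arabicLabel) = "ar")
  OPTIONAL {
    <http://dbpedia.org/resource/Cairo> owl:sameAs ?arabicResource .
    FILTER (STRSTARTS(STR(?arabicResource), "http://ar.dbpedia.org/resource/"))
  }
}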
Please doctor help me, this is an urgent problem that I need to resolve. Thank yo so much in advance. Regards, Aya. uHi Aya, On 06/24/2012 04:03 PM, Aya Zoghby wrote: The mappings are used to map one of the Wikipedia attributes of a specific infobox type to a property in the DBpedia ontology. The mappings can be updated through the mappings Wiki available at [1]. For each language there is a section in that Wiki. More information about the mappings Wiki and its benifits can be found the paper titled \"DBpedia and the Live Extraction of Structured Data from Wikipedia\", which you can find at [2]. Actually, the information contained in the Arabic Wikipedia can be extracted and converted to RDF using the current DBpedia framework, but using the mappings through the extraction process gives the framework more fine grained control on the extracted data. You can understand what I mean if you read the aforementioned paper. That's right, so one of the benefits of using a Wiki for the mappings, is that it enables the user to participate. So you are very welcome to register yourself in the mappings Wiki, and add more mappings for the Arabic language, which enriches the knowledge of DBpedia in general. Please also note that after you subscribe to mappings Wiki, you should send a mail asking for editor rights, in order to be able to modify the mappings. Can you please clarify that question? If you understands that question correctly, you can do that through the mappings Wiki. If you have any further questions, please don't hesitate to contact us back. [1] [2] program_el_dbpedia_live.pdf uDear Aya, please have a look at the ongoing internationalization of DBpedia: You are welcome to join in. The related publications for internationalization is this and it will answer most of your questions: *Dimitris Kontokostas, Charalampos Bratsas, Sören Auer, Sebastian Hellmann, Ioannis Antoniou und George Metakides: * /Internationalization of Linked Data: The case of the Greek DBpedia edition/ In: Web Semantics: Science, Services and Agents on the World Wide Web Link: All the best, Sebastian On 06/24/2012 04:03 PM, Aya Zoghby wrote:" "Service Temporarily Unavailable" "u503 Service Temporarily Unavailablenginx/0.8.37 Hello all , when i am trying to execute sparql query on the site , i always have the above result , did anyone knows the reason , The link iam using is : note : i hope that the reason is \" I am calling it from Syria \", since the information should be for all of us. regards,Ghassan uHi Ghassan, Please try again, there was a problem with the server which has been resolved Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 12 Nov 2010, at 13:22, Ghassan Alhamoud wrote: uHello Williams , thanks for your response , I had a problem on this link The error message is : The remote name could not be resolved: 'dbpedia.org' the link were running normally not now it is not. 
Regards,Ghassan CC: From: Subject: Re: [Dbpedia-discussion] Service Temporarily Unavailable Date: Fri, 12 Nov 2010 13:55:47 +0000 To: Hi Ghassan, Please try again, there was a problem with the server which has been resolved Best RegardsHugh WilliamsProfessional ServicesOpenLink SoftwareWeb: On 12 Nov 2010, at 13:22, Ghassan Alhamoud wrote:503 Service Temporarily Unavailablenginx/0.8.37 Hello all , when i am trying to execute sparql query on the site , i always have the above result , did anyone knows the reason , The link iam using is : note : i hope that the reason is \" I am calling it from Syria \", since the information should be for all of us. regards,Ghassan uHi Ghassan, Your page seems to be running the following query: Which runs fine for me from my location, thus is would appear the machine hosting that service cannot resolve the \"dbpedia.org\" domain name. Have you tried querying the dbpedia.org sparql endpoint yourself from the machine ie Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 12 Nov 2010, at 15:02, Ghassan Alhamoud wrote: uHi , It is public hosting and i could not access command line ,before i use the free hosting , i tried commercial one this is the link : do you think it security issue , and is there any workaround if yes. Subject: Re: [Dbpedia-discussion] Service Temporarily Unavailable From: Date: Fri, 12 Nov 2010 17:54:20 +0000 To: Hi Ghassan, Do you have the \"curl\" command on the machine hosting your application such that you can attempt to run this query from command line to see what it returns. If you do simply run the following and report what is returned: curl \" Best RegardsHugh WilliamsProfessional ServicesOpenLink SoftwareWeb: On 12 Nov 2010, at 17:21, Ghassan Alhamoud wrote:hi , First i build the query on dbpedia , then i put it on the page to get the result , but suddenly this page today could not be displayed . CC: From: Subject: Re: [Dbpedia-discussion] Service Temporarily Unavailable Date: Fri, 12 Nov 2010 15:45:10 +0000 To: Hi Ghassan, Your page seems to be running the following query: Which runs fine for me from my location, thus is would appear the machine hosting that service cannot resolve the \"dbpedia.org\" domain name. Have you tried querying the dbpedia.orgsparql endpoint yourself from the machine ie Best RegardsHugh WilliamsProfessional ServicesOpenLink SoftwareWeb: On 12 Nov 2010, at 15:02, Ghassan Alhamoud wrote:Hello Williams , thanks for your response , I had a problem on this link The error message is : The remote name could not be resolved: 'dbpedia.org' the link were running normally not now it is not. 
Regards,Ghassan CC: From: Subject: Re: [Dbpedia-discussion] Service Temporarily Unavailable Date: Fri, 12 Nov 2010 13:55:47 +0000 To: Hi Ghassan, Please try again, there was a problem with the server which has been resolved Best RegardsHugh WilliamsProfessional ServicesOpenLink SoftwareWeb: On 12 Nov 2010, at 13:22, Ghassan Alhamoud wrote:503 Service Temporarily Unavailablenginx/0.8.37 Hello all , when i am trying to execute sparql query on the site , i always have the above result , did anyone knows the reason , The link iam using is :http://dbpedia.org/sparql?default-graph-uri=http://hammoud-project.net/awm/rdf.aspx&should-sponge;=grab-all-seealso&query;=PREFIX+svu:+%3Chttp://hammoud-project.net/awm/rdfs2.aspx%23%3E%0D%0Aselect+%3Fd+%0D%0Afrom+%3Chttp://hammoud-project.net/awm/rdf.aspx%3E%0D%0A+where+{%0D%0A++++++++%3Fa+svu:name+%3Fd+%0D%0A}&debug;=on&timeout;=&format;=text/html note : i hope that the reason is \" I am calling it from Syria \", since the information should be for all of us. regards,Ghassan uHi Ghassan, You should check with your ISP, the DNS records for dbpedia.org are resolved fine from my end too. Best Regards, Mitko On Nov 12, 2010, at 8:15 PM, Ghassan Alhamoud wrote:" "Issues getting extraction framework to run on Windows" "uHello, Just today I have been attempting to get the latest version of the DBpedia extraction framework running on my computer (Windows Vista 32-bit). I've done a clean (full) install of PHP 5.3.0 to \"C:\Program Files\PHP\" (default directory) and the checked out the SVN repository in its entirety. To get the framework running, the Documentation page on the wiki ( the command line. However, this is where I start running into problems. Executing \"php start.php\" (inside cmd.exe) from the \"extraction\" directory returns the following: PHP Warning: require_once(ExtractionJob.php): failed to open stream: No such fi le or directory in C:\Users\Alex\Documents\DBpedia\extraction\start.php on line 96 PHP Fatal error: require_once(): Failed opening required 'ExtractionJob.php' (i nclude_path='.;C:\php5\pear') in C:\Users\Alex\Documents\DBpedia\extraction\star t.php on line 96 I proceeded by trying to run extract_test.php in the same way, but also receive an error here: class ValidateExtractionResult { public static function validate($result, $extractor){ $triples = $result- getTriples(); $md = $extractor->getMetadata(); $produces = $md[PRODUCES]; foreach ($triples as $triple){ $checked = false; foreach ($produces as $rule){ if($rule['type']==STARTSWITH && Rule::checkStartsWith($rule, $triple)){ $checked = true; break; } } if($checked!= true){ print_r($produces); die(\"Extractor: \".get_class($ext ractor).\" has wrong settings in metadata['produces'] at \n\".$triple->toString()) ; } } } private static function log($message){ Logger::logComponent('core', \"validateextraction result\", DEBUG ,$message); } } PHP Fatal error: Class 'ValidateExtractionResult' not found in C:\Users\Alex\Do cuments\DBpedia\extraction\extractors\ExtractorContainer.php on line 32 installation of PHP 5.x (5.3 in my case) and the entire source for the extraction framework ought to work \"out of the box\". However, this is far from the case for me, and I have been rather stumped in trying to resolve things. As a slight side point, I should note that I did have to make a few minor modifications to my environment after the clean installation of PHP. 
I was receiving message boxes complaining that the modules libglib-2.0-0.dll and OCI.dll could not be found, but this was resolved by copying the bin dir of the GTK+ bundle for Windows into the root PHP dir (for the former DLL) and commenting out the following lines in php.ini (for the latter DLL): ;[PHP_OCI8] ;extension=php_oci8.dll ;[PHP_OCI8_11G] ;extension=php_oci8_11g.dll ;[PHP_PDO_OCI] ;extension=php_pdo_oci.dll However, judging by the presumed resolution of these issues and the details of the error messages I am no receiving, this likely has nothing to do with my current issues. Simple PHP scripts run fine on my computer (at least those served from Apache). Any assistance in figuring out the causes and solutions of these unexpected errors would be much appreciated. Regards, Alex uHello, Alex schrieb: We just tried extract_test.php on Windows. There was a small error ( uHI Jens, No more error messages after the update, so we've made some progress at least. Thanks for the speedy fix. However, running extract_test.php does not seem to actually be outputting the RDF triples. The following is what I get running the script from command prompt. C:\Users\Alex\Documents\DBpedia\extraction>php extract_test.php \"London\"@en . The output takes about two seconds to appear, followed by the program immediately terminating. As I understand, extract_test.php should be using SimpleDumpDestination and thus printing directly to stdout. Any ideas please? Regards, Alex uHello, Alex schrieb: Doesn't it print to stdout? Reading your message, it appears that one triple (in N-Triples format) was extracted and print to stdout. extract_test.php downloads the page specified in extract.php from Wikipedia (which explains the delay). In this case, it is the article about \"London\". It then runs the extractor specified in extract_test.php on this article (by default \"SampleExtractor\"). The result is printed to stdout. As mentioned previously, extract_full.php should be used for producing a complete DBpedia release (but you need to use import.php to download the corresponding Wikipedia dumps before and import them in MySQL databases). Kind regards, Jens uHi, That was just me being silly, it seems. I failed to notice that extract_test.php uses SampleExtractor, and that SamplExtractor is only meant to print out one triple. Changing it to use InfoboxExtractor proves that the framework is indeed working. There is one other point to note, howeverThe databaseconfig.php file is now missing from the SVN repository. I did however have an old version of the file in my directory (not sure which revision), which I used to run extract_dataset.php (this script requires the databaseconfig file). 
I receive these errors however: C:\Users\Alex\Documents\DBpeida\extraction>php extract_dataset.php PHP Warning: mysql_fetch_assoc(): supplied argument is not a valid MySQL result resource in C:\Users\Alex\Documents\DBpeida\extraction\iterators\AllArticlesSql Iterator.php on line 55 PHP Warning: mysql_fetch_assoc(): supplied argument is not a valid MySQL result resource in C:\Users\Alex\Documents\DBpeida\extraction\iterators\AllArticlesSql Iterator.php on line 55 Without the databaseconfig.php file in place, I receive the following errors: C:\Users\Alex\Documents\DBpeida\extraction>php extract_dataset.php PHP Warning: include(databaseconfig.php): failed to open stream: No such file o r directory in C:\Users\Alex\Documents\DBpeida\extraction\iterators\AllArticlesS qlIterator.php on line 19 PHP Warning: include(): Failed opening 'databaseconfig.php' for inclusion (incl ude_path='.;C:\php5\pear') in C:\Users\Alex\Documents\DBpeida\extraction\iterato rs\AllArticlesSqlIterator.php on line 19 PHP Notice: Undefined variable: dbprefix in C:\Users\Alex\Documents\DBpeida\ext raction\iterators\AllArticlesSqlIterator.php on line 21 PHP Notice: Undefined variable: host in C:\Users\Alex\Documents\DBpeida\extract ion\iterators\AllArticlesSqlIterator.php on line 23 PHP Notice: Undefined variable: user in C:\Users\Alex\Documents\DBpeida\extract ion\iterators\AllArticlesSqlIterator.php on line 23 PHP Notice: Undefined variable: password in C:\Users\Alex\Documents\DBpeida\ext raction\iterators\AllArticlesSqlIterator.php on line 23 PHP Warning: mysql_connect(): Access denied for user 'ODBC'@'localhost' (using password: NO) in C:\Users\Alex\Documents\DBpeida\extraction\iterators\AllArticle sSqlIterator.php on line 23 Keine Verbindung m´┐¢glich: Access denied for user 'ODBC'@'localhost' (using pas sword: NO) Is the extract_dataset.php file now deprecated too? I can't seem to figure out its purpose. (Unfortunately, the Documentation page on the wiki is still very incomplete, as I'm sure you know.) If not, do these errors mean anything to you? Thanks for your assistance. Regards, Alex" "A modeling issue?" "uHi, I was trying to create a query to retrieve nuclear power plants from dbpedia. My query was like this. SELECT * where { ?concept skos:broader . ?subject skos:broader ?concept. ?plant skos:subject ?subject. ?plant a ?type. ?type rdfs:subClassOf yago-class:Station104306080. } I did another query for airports where I was able to use rdf:type of . I thought I would be able to use rdfs:type for my query of nuclear power stations, but nuclear power stations didn't seem to use that concept - not sure why. From the results of my query I quickly noticed that San Onofre Nuclear power plant, among others was not in the results. I looked up San Onofre at this address It is categorized as a dbpedia:Category:Domes nd dbpedia:Category:Energy_resource_facilities_in_California . since dbpedia:Category:Nuclear_power_stations skos:broader dbpedia:Category:Domes I thought maybe the connection there would let me get San Onofre, but it seems it would include a lot of other Domes that are not dbpedia:Category:Nuclear_power_stations . However I also saw this Nuclear Power station (that didn't come up in my results) dbpedia:Rancho_Seco_Nuclear_Generating_Station which is not a dome, so that doesn't really help. Bottom line after I got my query for airports together I thought it wouldn't be a big deal to find other things categorically. I get the feeling from working this example that is not the case. 
Can you see a better way to get the nuclear power stations? Is this problem due to the way dbpedia information is modeled? Do you think this data will get better? Thanks Brian uHi Brian, two reasons you can't find your powerplant in DBpedia in the same way you found airports: 1. has no infobox. Compare e.g. to the wiki markup by clicking on \"edit this page\". There you see the \"infobox NPP\". We need those infoboxes to extract the main structured data for a resource and to assign a class in the dbpedia ontology. 2. We don't have nuclear power plants in the dbpedia ontology yet. As I wrote in my email to the mailinglist an hour ago, the ontology and the mappings were defined manually, and we just started with the classes/infoboxes that seems most relevant to us, since that's a lot of work. So even the Limerick Nuclear Power Plant won't show up in the ontology dataset at the moment. And we don't have a user interface for the community to help with the mappings yet. If you want to help, you can \"design\" the power plant class and all according properties for the dbpedia ontology (see infoboxes to ontology properties. I will include them in our mapping-repository and extract your data. Cheers, Georgi uOn 2/7/09 1:01 PM, Brian Hardy wrote: If we could collate a list of queries like these somewhere, myself and others can use them as fodder for a variety of demos. The more self description from the \"beholder\" pool the easier it is for providers of \"beauty\" :-) If possible, could we have a Wiki entries for: Things I would like to find easily using DBpedia and/or LOD cloud. At least on our part (OpenLink) we can make a list of SPARQL Queries, Report Pages (persisted Faceted Browser Pages), and other goodies that make life easier. As I've said: \"we are shrinking the Report Writer realm of yore via the Web as a DBMS and URIs\" . Kingsley" "Bad Turtle, No Cookie" "uI've been trying to process DBpedia Live with a pipeline that uses Jena and I've found 8765 triples that Jena won't parse from The rejected triples can be found here: Several sorts of problem turn up, I'm sure that most of them are problems on the DBpedia side, such as the use of URLs that contain \u escapes, both in the subject and object fields, but also in the literal type field. I'm not so sure about the use of \U escapes in labels in N-Triples, where there seems to be some confusion about how to handle Unicode characters." "FW: Query time when using POST vs GET" "u u0€ *†H†÷  €0€1 0 +" "DBpedia 3.8 Abstract extraction problem" "uHello, I need to extract wikipedia articles for a project I'm working on. Everthing seems to go smoothly, until the Abstract Extractor starts. Then I get tons of errors like these two : sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1 apply$mcVI$sp INFO: Error retrieving abstract of title=Andre Agassi;ns=0/Main/;language:wiki=en,locale=en. Retryingjava.net.ConnectException: Connexion refusée at org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1.apply$mcVI$sp(AbstractExtractor.scala:118) sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1 apply WARNING: error processing page 'title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en' java.lang.Exception: Could not retrieve abstract for page: title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134) Any idea how to fix this problem ? 
Thank you! Olivier Sollier uHi Olivier, On 09/06/2012 07:32 PM, Sollier Olivier wrote: The abstract extractor requires the installation of a local mirror of Wikipedia, at least for the required language, in order for it to be able to resolve templates. A template is a simple sequence of characters that has a special meaning for Wikipedia, and those templates are handled in a special way. Ex: {{convert|1010000|km2|sp=us}} This template tells Wikipedia that the area of some country is 1010000 square kilometers, and when it is rendered, Wikipedia should display its area in both square kilometers and square miles. So, Wikipedia will render it as “1,010,000 square kilometers (390,000 sq mi)”. So, the abstract extractor cannot work correctly without this local Wikipedia. More details about the abstract extraction can be found in [1]. [1] program_el_dbpedia_live.pdf" "GSoC 2013 DBpedia + Spotlight joint proposal (please contribute within the next days)" "uDear lists, Dimitris Kontokostas has started to draft a document for submission at Google Summer of Code: We are still in need of ideas and mentors. If you have any improvements on DBpedia or DBpedia Spotlight that you would like to have done, please submit it in the ideas section now. Note that accepted GSoC students will receive about 5000 USD, which can help you to estimate the effort and size of proposed ideas. It is also ok to extend/amend existing ideas (as long as you don't hi-jack them). Becoming a mentor is also a very good way to get involved with DBpedia. As a mentor you will also be able to vote on proposals, after Google accepts our project. Note that it is also ok, if you are a researcher and have a suitable student, to submit an idea and become a mentor. After acceptance by Google the student then has to apply for the idea and get accepted. Please take some time this week to add your ideas and apply as a mentor, if applicable. Feel free to improve the introduction as well and comment on the rest of the document. Information on GSoC in general can be found here: Thank you for your help, Sebastian uDear all, Today evening we will move the Google Document to the wiki. This is a kind reminder to add any contributions before that for easier collaboration.
Kind regards, Dimitris Kontokostas On Wed, Mar 20, 2013 at 11:49 AM, Sebastian Hellmann < > wrote: uOn 20 March 2013 09:49, Sebastian Hellmann < > wrote: I meant to reply to this, but must have hit 'Discard' instead of 'Send'. Money is not a good gauge of the scope of the projects - $5000 is worth a lot less in Ireland than it is in Germany, a lot less in Germany than in Russia, etc. The two things to bear in mind are that 1) this competition is open to _all_ students registered to a third level institution - from first year to PhD; and 2) the projects are intended to take 3 months, with 1 month of 'community bonding', where the student should take care of any background research that may be needed for the project. The ideas don't need to be fully formed - though if there are details, it's nice to provide them - they just need to beideas. There is almost a month between the publication of the list of organisations and the deadline for proposals, which can be used to refine the ideas - at Apertium, we have students discussing ideas with us *now*, even though there is no guarantee that we will be accepted. Also, the project ideas are intended to be a guide: the students are free to - and should be encouraged to! - submit their own ideas. They may additionally - if they can't find a suitable organisation - apply to the Google Open Source Office, provided that they can suggest a suitable mentor, though this is typically an option only available for more research-oriented projects. uHi Jimmy, thanks for your tips! I added/extended two ideas yesterday. I ended up at six to eight paragraphs with 400 to 500 words. Do you think that's too long? The 2012 ideas I looked at were shorter. Cheers, JC On 27 March 2013 13:11, Jimmy O'Regan < > wrote: uOn 27 March 2013 15:51, Jona Christopher Sahnwaldt < > wrote: What I intended to say didn't come out quite as I meant :) uOn 27 March 2013 17:47, Jimmy O'Regan < > wrote: uOn 27 March 2013 21:47, Jona Christopher Sahnwaldt < > wrote: uPlease enable commenting on the Google Doc. I have spotted typos \" Identifiers and data provided by DBpedia was greatly involved in creating this knowledge graph.\" -> identifiers and data were I also have comments about ideas: \"Mapping service from Wikidata properties to DBpedia ontology\" -> I am not sure that syntax is nice. Instead of using the mappings wiki, with GSoC money, in 3 months, we could probably do something better. More like the mapping tool, or like WikiData itself does. I don't see why we need that ridiculous wiki syntax to be any more in the middle of our lives. A simple database-backed application would do it. All we need is to associate identifiers from once source to identifiers of another source. There was a nice thread at the WikiData list from which we could extract a couple of nice things for our GSoC2013 proposal: 1) BBC is effectively using DBpedia to solve very real world problems. This should be in our intro. 2) - Our identification mechanism needs work - Our simplistic typing heuristics via Infobox needs work These two things could be *part* of GSoC 2013 projects. Cheers, Pablo On Wed, Mar 27, 2013 at 10:51 PM, Jimmy O'Regan < > wrote: uHi Pablo, I disabled the editing of GDoc, we will use the wiki page from now on wiki.dbpedia.org/gsoc2013/ideas Cheers, Dimitris On Thu, Mar 28, 2013 at 11:48 AM, Pablo N. Mendes < >wrote: uI noticed. I assume that from now on, comments go via e-mail and final changes should be added directly to the wiki? 
Cheers, Pablo On Thu, Mar 28, 2013 at 10:54 AM, Dimitris Kontokostas < >wrote: uI think it is better this way. DBpedia.org has a very strange markdown / wiki format and it took me too long to convert the gdoc. BTW, I also removed some text from JC's ideas because they were breaking the formatting Cheers, Dimitris On Thu, Mar 28, 2013 at 12:06 PM, Pablo N. Mendes < >wrote: uAll links are missing. I think they are important. On Thu, Mar 28, 2013 at 11:29 AM, Dimitris Kontokostas < >wrote: uHi Dimitris, I've just noticed there's an highlighted 'missing ref' in the 'ontology consistency check' section. The references are: A. Pohl at WoLE 2012 A. Gangemi at ISWC 2012 A. Aprosio at ESWC 2013 Could you please add them? Thank you in advance. Cheers! Marco On 3/28/13 11:40 AM, Pablo N. Mendes wrote:" "Update" "uHello everybody, For all of you interested in DBpedia here some updates, what's going on behind the scenes: * Piet and Georgi are currently in the process of restructuring and deploying the extraction code, we are facing a number of smaller and larger problems here (e.g. the Wikipedia dumps were growing above 2GB and some tools seem to have problems handling files of this size) * depending on their success, we hope to release new data-sets within the next week or so, we hope most of the problems which were brought up by you are solved, however we will still rely on your feedback for QA * at Universität Leipzig we got a brand new multi-core server for DBpedia, where we will migrate some demos to over the summer * since DBpedia so far was rather a side project, we are at the moment in the process of acquiring dedicated funding to refine DBpedia and to intensify work on it - let us know, if you have some ideas with this regard or want to help out proposal writing ;-) That's all for today, have a nice weekend everybody Sören" "DBPedia: Marcos_Escobedo" "uHi,   The following resource in DBPedia: is a Person.   which has a country: , but in fact that is also a Person.   How is this possible? Hi, The following resource in DBPedia: < uHi On 02/11/2013 12:01 PM, Vishal Sinha wrote: this is because in the mappings of \"Infobox military person\" [1], attribute \" allegiance\" is mapped to property \"country\" in the DBpedia ontology. Hope that clarifies the issue. [1] index.php?title=Mapping_en:Infobox_military_person uHi Vishal, it is not a matter of mappings, it is a matter of wikipedia. If you edit the wikipedia page \"Marcos Escobedo\" you will see in the infobox: |allegiance=[[Miguel Hidalgo]] this means a link to the wikipedia page \"Miguel Hidalgo\", which is a desambiguation page ( a UNIQUE value: \"Miguel Hidalgo y Costilla\". A solution could be to change the link to the \"State\" Miguel Hidalgo ( Miguel Hidalgo. Best regards, Saludos cordiales, -Mariano On Mon, Feb 11, 2013 at 12:01 PM, Vishal Sinha < >wrote: uHi, the DBpedia mapping seems correct. *allegiance* – *optional* – the country or other power the person served. Maybe a more appropriate dbpedia-owl property could be used for this infobox property, e.g. [2] I am going to change the mapping, but it is also needed to modify the Wikipedia article in my opinion. Regards [1] [2] 2013/2/11 Mariano Rico < > uHi Andrea and Mariano, On 02/11/2013 01:42 PM, Andrea Di Menna wrote: I did not mean that the mapping is incorrect, but he was asking why property \"country\" refers to , and I'm just explaining why it is like this. 
uHi Paul,   You can just type following in your browser:   Now, you can see: against the property:   dbpedia-owl:country   I am just browsing the existing data on DBPedia, I am not sure whethere that happens due to reasoning or that is the data without reasoning.   You mentioned that: \"The ontology is best used just as a loose reference guide, not to reason on : )\"   I have serious doubts on your this statement.   Thanks.   From: Paul Wilton < > To: Vishal Sinha < > Cc: David Wood < >; \" \" < >; \" \" < > Sent: Monday, February 11, 2013 7:24 PM Subject: Re: DBPedia: Hi Vishalif you are loading the instance data and the dbpedia ontology into a triple store that has RDFS reasoning capability then its because the ontology properties that are typically defined with domains and ranges are misused throughout the dbpedia instance data - which causes instances to be inferred to be the wrong class types. This is prevalent throughout - its a bit of a mess to say the least. The ontology is best used just as a loose reference guide, not to reason on : ) Paul On Mon, Feb 11, 2013 at 12:56 PM, David Wood < > wrote: Because real data is dirty. uHi, this is one example of the limits in producing an ontology without considering the full range of data that it has to provide a schema for. See e.g. [1] for an approach of deriving \"ontology patterns\" from data, in order to optimize the semantics of the ontology. In DBpedia, one may want to establish one or more thresholds (even contextual or dynamic) for including alternative types in the domain/range of properties. Vishal's example spots the optimism of mapping dbpprop:allegiance directly to dbpedia-owl:country, while dbpprop:allegiance can range on the following types: i.e. about 855 out of 9660 (almost 9%) are not places (let alone countries, which are 8336, i.e. 86%). That would make a 14% error rate on the declared range, quite a lot. The approach described in [1] would lead to a different mapping: dbpedia-owl:allegiance: domain: dbpedia-owl:Agent range: dbpedia-owl:Agent owl:union dbpedia-owl:EthnicGroup owl:union dbpedia-owl:Event owl:union dbpedia-owl:Place Best Aldo [1] Presutti V., Aroyo L., Adamou A., Schopman B., Gangemi A., Schreiber G. Extracting Core Knowledge from Linked Data. Proceedings of the Second Workshop on Consuming Linked Data, COLD2011, Workshop in conjunction with the 10th International Semantic Web Conference 2011 (ISWC 2011), On Feb 11, 2013, at 7:16:14 PM , Jona Christopher Sahnwaldt < > wrote:" "Spotlight question" "uHello, As I understand, I can use Spotlight for text annotation using DBPedia ontology? Is it possible to use my ontology? Or, what should I do to include instances from my ontology to DBPedia? Best, Srecko Hello, As I understand, I can use Spotlight for text annotation using DBPedia ontology? Is it possible to use my ontology? Or, what should I do to include instances from my ontology to DBPedia? Best, Srecko uSrecko, This is the DBpedia mailing list. Although DBpedia Spotlight shares a family name with the DBpedia Extraction Framework and the DBpedia Dataset, we are a separate project (which benefits from the clever work done by the rest of the DBpedia family). The message might be marginally interesting to the people in here, but I think it's more appropriate to move it to the dbp-spotlight-users mailing list (CCed in this message). Please redirect the rest of this thread there. Yes, you can use your own ontology in two ways. 
First, you can use it to select which kinds of *DBpedia resources* to include in the annotations. For that you only need to have mappings from your ontology to DBpedia and load all the data into a SPARQL endpoint that will then be used when giving a SPARQL query to DBpedia Spotlight. Second, you can teach DBpedia Spotlight to recognize instances of your own ontology (instead of only DBpedia Resources). At the moment you would also require a corpus containing annotations of instances of such ontology, or mappings to DBpedia (for which we have Wikipedia as a corpus). We need these text examples in order to \"teach\" DBpedia Spotlight to recognize your entities once a new text is presented to the system. More help is available here: Does this answer your question? Cheers, Pablo On Thu, Aug 4, 2011 at 1:06 PM, srecko joksimovic < > wrote: uPablo, Thank you very much for such a brief answer. I will look documentation in more details, but I needed this information so I could be sure that I am going into the right direction. Best, Srecko On Thu, Aug 4, 2011 at 3:04 PM, Pablo Mendes < > wrote:" "Lucene Index and Disambiguation" "uHi, Someone has already indexed data from DBpedia with Lucene? I currently have a local repository with the following loadings: labels, types and short abstract. I wanted to index all labels DBpedia and use the short abstract for disambiguation. For disambiguation what better way to do this? How can I catch the short abstract words more relavantes related to a particular resource? Thanks in advance, David. Hi, Someone has already indexed data from DBpedia with Lucene? I currently have a local repository with the following loadings: labels, types and short abstract. I wanted to index all labels DBpedia and use the short abstract for disambiguation. For disambiguation what better way to do this? How can I catch the short abstract words more relavantes related to a particular resource? Thanks in advance, David. uHi David, Please take a look at the DBpedia Spotlight project on Github. We did exactly that (and much more). Cheers, Pablo On Mon, Jun 10, 2013 at 12:04 PM, David Miranda < >wrote:" "help with a sparql query" "uI was wondering if I could get some help/advice with a sparql query I'm running against dbpedia. I've got this query to get a list of US publishers and their abstract which returns nicely [1]: SELECT ?publisher ?abstract WHERE { ?publisher . ?publisher ?abstract . } But if I add a match for the rdfs:label I get no results [2] SELECT ?publisher ?abstract ?publisherName WHERE { ?publisher . ?publisher ?abstract . ?publisher ?publisherName . } Am I overlooking something obvious here? I thought that all the ?publisher resources would have an rdfs:label //Ed [1] [2] avmqBQ uThanks Gautier! I'm kind of confused about why filtering on language tag is necessary, but it certainly seems to work. //Ed On Tue, Oct 12, 2010 at 5:22 PM, Gautier Poupeau < > wrote:" "DBpedia session in Workshop Multilingual Linked Open Data for Enterprises" "uDear DBpedia community, developers & I18n committee The Multilingual LOD for Enterprises (MLODE) workshop [1] is very closely related to our work and Sebastian Hellmann and Steven Moran arranged a DBpedia session. Nothing is finalized yet so we are looking for your input. I'll try to coordinate the DBpedia community communication but I am also looking for 1-2 more persons for more active help - preferably (but not restricted) from the DBpedia I18n committee. 
The main task is to suggest speakers/topics and forward the workshop to people (or/and companies/enterprises in this case) that might be interested. Besides the coordinators you are of course all welcome to contribute (in addition to the former) in whatever way you can: - Ideas for presentations - ideas on promoting further the DBpedia project & I18n effort - Ideas on other bullets in this list ;) Anyway, this could also be a nice opportunity to get to know each other in person and ideally a representative from each DBpedia Chapter could attend. Waiting for your input, Best Dimitris Kontokostas [1] mlode" "DBpedia Editor Right" "uTo whom it may concern, I hope this email finds you well. I am writing to ask for editor rights for my account. The username is WaelComair. Sincerely yours, Wael Comair" "dbpedia data on lodview" "uHi dbpedians, I just want to share with you how dbpedia data looks like on lodview (my new opensource rdf browser) happy browsing ;) diego Hi dbpedians, I just want to share with you how dbpedia data looks like on lodview (my new opensource rdf browser) diego uPosting to the mailing list On 12/24/14 12:19 PM, Marco Fossati wrote: uAbsolutely yes! I just have to implement a 303 redirect based on a prefix (I use suffixes) to manage 'resource' vs 'page' and emulate current dbpedia behavior Really I don't like redirects for HTML representations and I developed LodView to manage content negotiation without it, yesterday I added a configurable redirect system based on suffixes, it is easy now to ask LodView to dereference Do you think that 303 is really a MUST in LOD publications? A lookup search is easy to implements for those who already expose a lookup service (like dbpedia) for the others a SPARQL query based on a regex could be very frustrating" "Playing with SPARQL endpoints" "uChristophe, I'm cc'ing the DBpedia list as this might be interesting to other people as well. On 17 Oct 2007, at 17:23, Christophe Tricot wrote: We use Jena in many projects. It's fairly big and complex, but in terms of quality and support and feature set by far the best Semantic Web framework for Java. When working with PHP, I use RAP, which also has a SPARQL client library [1]. When working from Javascript, I use Lee Feigenbaum's SPARQL client library [2]. But actually the SPARQL protocol is fairly simple. If you have a SPARQL query and a SPARQL endpoint, then it's very easy to construct the query URL and send an HTTP request and get back a SPARQL result in XML. The SPARQL XML result format is quite simple as well, so you can parse it with any off-the-shelf XML parser. Or use the JSON result format and process it with an off-the-shelf JSON tool. In other words, you can also get SPARQL results from DBpedia without using an external SPARQL client library. Best, Richard [1] [2] sparql_calendar_demo_a_sparql.html uRichard Cyganiak wrote: Christophe, You can also work with RDF via SPARQL using the OpenLink Ajax Toolkit (OAT) [1][2]. This toolkit possesses a Database Connectivity layer for SQL (via XMLA), RDF (via SPARQL), XML and XML based Web Services (REST or SOAP) called: Ajax Database Connectivity. It is also important to note that an entire Data aware Widget set exists for Ajax Database Connectivity supporting the different Data Access options (SQL, RDF, XML etc). In addition to the above, you may also consider working with Virtuoso's VSP (Virtuoso Server Pages) or VSPX (XML based declarative language variant of VSP) [3]. 
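A regex-based label search of the kind mentioned above is easy to write but hard on the endpoint, which is why a dedicated lookup service is usually the better answer. A minimal sketch of the SPARQL fallback follows; the search term is only an illustration, and plain FILTER regex is generally not index-backed, so on a dataset the size of DBpedia it will be slow.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE {
  ?resource rdfs:label ?label .
  FILTER ( lang(?label) = "en" && regex(?label, "berlin", "i") )
}
LIMIT 20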
These technologies give you in process access to Virtuoso Data using similar constructs to the likes of PHP etc Finally, you can also use PHP inside Virtuoso should this be your preferred development environment :-) Links: 1. 2. 2. Kingsley" "wikipedia articles using intermediary template" "uHi all! I'm trying to use dbpedia to collect weather statistics for cities. Most of those use the Weather box wikipedia template and its parameters (the stats I need) are then mapped directly into the corresponding dbpedia page. For exemple we can find this in the source of the article for the city of Vancouver : *{{Weather box|location = [[Vancouver International Airport]]|metric first = Y|single line = Y|Jan maximum humidex = 17.2|Feb maximum humidex = 18.0}}* And corresponding data in the dbpedia page : *dbpprop:janMaximumHumidex : 17.200000dbpprop:febMaximumHumidex : 18* Which is what I'm using right now. But for a lot of cities, there is an intermediary template that seems to be really just a shortcut, like for Seattle , where the source on the wikipedia article has \"{{Seattle weatherbox}}\" and then in the Seattle weatherbox template wiki Template:Seattle_weatherbox> there is the actual Weather box template. For those cities (or so it seems), the weather stats don't end up in the corresponding dbpedia page (as is the case for Seattle ). So here are my questions : 1. Am I right in assuming that this \"shortcut\" template is the reason why the parameters of the weather box don't appear in dbpedia, that is, the dbpedia extractor just sees \"{{Seattle weatherbox}}\" and doesn't go looking into that template and basically ignores it? 2. Is there a way to get the data for those cities which use this double-template scheme? Thanks a bunch! Simon Hi all! I'm trying to use dbpedia to collect weather statistics for cities. Most of those use the Weather box wikipedia template  and its parameters (the stats I need) are then mapped directly into the corresponding dbpedia page. For exemple we can find this in the source of the article for the city of Vancouver : {{Weather box |location = [[Vancouver International Airport]] |metric first = Y |single line = Y |Jan maximum humidex = 17.2 |Feb maximum humidex = 18.0 }} And corresponding data in the dbpedia page  : dbpprop:janMaximumHumidex : 17.200000 dbpprop:febMaximumHumidex : 18 Which is what I'm using right now. But for a lot of cities, there is an intermediary template that seems to be really just a shortcut, like for Seattle , where the source on the wikipedia article has '{{Seattle weatherbox}}' and then in the Seattle weatherbox template there is the actual Weather box template. For those cities (or so it seems), the weather stats don't end up in the corresponding dbpedia page (as is the case for Seattle ). So here are my questions : 1. Am I right in assuming that this 'shortcut' template is the reason why the parameters of the weather box don't appear in dbpedia, that is, the dbpedia extractor just sees '{{Seattle weatherbox}}' and doesn't go looking into that template and basically ignores it? 2. Is there a way to get the data for those cities which use this double-template scheme? Thanks a bunch! Simon" "creating owl:sameAs" "uHi All, I want to generate the owl:sameAs link following the instruction in this link[1]. However, after I run the script (e.g. : sh interwiki_links.sh 'id' 'en') there's no owl:sameAs predicate in the outcome file. I only find the wikiPageInterLanguageLink predicate. 
Could you help me with this problem, I mean, perhaps there was some steps that I miss. Or is there any guidance/tutorial that I can follow other than the link I mentioned before? Here is the link of my shell script[2]. There are some modification, just to make the input file more suitable for my working environment.   [1]  [2]  Regards, Riko Hi All, I want to generate the owl:sameAs link following the instruction in this link[1]. However, after I run the script (e.g. : sh interwiki_links.sh 'id' 'en') there's no owl:sameAs predicate in the outcome file. I only find the wikiPageInterLanguageLink predicate. Could you help me with this problem, I mean, perhaps there was some steps that I miss. Or is there any guidance/tutorial that I can follow other than the link I mentioned before? Here is the link of my shell script[2]. There are some modification, just to make the input file more suitable for my working environment. [1] Riko uHi Riko, this script works with the old dump structure. You can use this one instead Best, Dimitris On Mon, Apr 15, 2013 at 8:10 AM, Riko Adi Prasetya < >wrote:" "Extraction framework french tutorial" "uHello ! I just wrote a tutorial about how to use the DBpedia extraction framework in french and just updated with the Wikidata extraction process. If you have any questions/comments about it don't hesitate to ask. Hope this will be usefull. Best. Julien. Hello ! I just wrote a tutorial about how to use the DBpedia extraction framework in french and just updated with the Wikidata extraction process. Julien." "URI Corrections!" "uAll, The corrected URIs are inserted below :-) This One > My Personal Data Space Server: End > I hope this helps everyone when using the SPARQL Query By Example" "collecting the List of cities per country from DBpedia" "uDear All, I want to create a list of cities per country from DBpedia. I first downloaded these two files from the website instance_types_en.ttl.bz2 , raw_infobox_properties_en.ttl.bz2 The first one has the types for all DBPedia entries and the second one has their relationships That is what I did 1- From the instance_types_en.ttl.bz2 file I extract the Cities. For example < Ankara has type of City 2- From raw_infobox_properties_en.ttl.bz2 file , I Extract the cities from country using these two patterns \"Country\"@en . or . \"United Kingdom\"@en . 3- Essentially, - I compile a list of cities from step 1, - In step 2, I Extract all triplets in raw_infobox_properties_en.ttl.bz2 that have a property field called country - I will then look at the left resource and if it exists in my city list, then I know the right hand side is a country - Then I will create a list of Cities per country The results are very poor, I dint get anything for US. For UK I got just one country, overall the results were disaster. Am I missing something here? Is there any way to compile the list of cities per country? I appreciate your comments Dear All, I want to create a list of cities per country from DBpedia. I first downloaded these two files from the website   instance_types_en.ttl.bz2 , raw_infobox_properties_en.ttl.bz2 The first one has the types for all DBPedia entries and the second one has their relationships  That is what I did 1- From the instance_types_en.ttl.bz2 file I extract the Cities. 
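When the interlanguage-link triples are already loaded in a SPARQL store, the owl:sameAs links the script is supposed to emit can also be derived with a CONSTRUCT query. This is only a sketch: the exact predicate URI below is an assumption and differs between dump releases, and the FILTER keeps only links pointing into the English dataset.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dbo: <http://dbpedia.org/ontology/>
CONSTRUCT { ?s owl:sameAs ?o }
WHERE {
  ?s dbo:wikiPageInterLanguageLink ?o .
  FILTER ( STRSTARTS( STR(?o), "http://dbpedia.org/resource/" ) )
}

The result can be serialized back to N-Triples and used in place of the file the shell script was expected to produce.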
For example < comments" "about dbpedia2:birthPlace" "uHi, For a given subject, there could be multiple objects for predicate dbpedia2:birthPlace, for instance, British Columbia Quesnel, British Columbia Canada My problem is how to put these birthplace together to form a final one, why not use 'Quesnel, British Columbia, Canada' as its value in dbpedia directly at the information extraction step?" "What do we call high-level categories in the dbpedia ontology?" "uThis might be a dumb question, but here goes In generic databases there are some very broad categories that are useful, in particular, the kind of high level categories that are in the dbpedia ontology such as * Person * Place * Creative Work, etc. * Organization What are these categories (as a group) called? I see these as important because they cut across subjects: \"Physics\", \"Baseball\" or \"Feminist Movement\" are going to contain instances of all of the above. Note that the Dbpedia ontology is focused on things that are relatively concrete. A number of wikipedia articles reference \"Concepts\", \"Inventions\" or other things that are harder to pin down than the above concrete categories, for instance, or If more of wikipedia were to be categorized, the dbpedia ontology would need to be extended to deal with this sort of thing. What do we call the set of high-level categories that includes the current dbpedia ontology types + more abstract things?" "DBpedia Open Text Extraction Challenge - TextExt" "u*DBpedia Open Text Extraction Challenge - TextExt* Website: *_Disclaimer: The call is under constant development, please refer to the news section. We also acknowledge the initial engineering effort and will be lenient on technical requirements for the first submissions and will focus evaluation on the extracted triples and allow late submissions, if they are coordinated with us_*. Background DBpedia and Wikidata currently focus primarily on representing factual knowledge as contained in Wikipedia infoboxes. A vast amount of information, however, is contained in the unstructured Wikipedia article texts. With the DBpedia Open Text Extraction Challenge, we aim to spur knowledge extraction from Wikipedia article texts in order to dramatically broaden and deepen the amount of structured DBpedia/Wikipedia data and provide a platform for benchmarking various extraction tools. Mission Wikipedia has become the ubiquitous source of knowledge for the world enabling humans to lookup definitions, quickly become familiar with new topics, read up background infos for news event and many more - even settling coffee house arguments via a quick mobile research. The mission of DBpedia in general is to harvest Wikipedia’s knowledge, refine and structure it and then disseminate it on the web - in a free and open manner - for IT users and businesses. 
News and next events Twitter: Follow @dbpedia , Hashtag: #dbpedianlp * LDK conference joined the challenge (Deadline March 19th and April 24th) * SEMANTiCS joined the challenge (Deadline June 11th and July 17th) * Feb 20th, 2017: Full example added to this website * March 1st, 2017: Docker image (beta) Coming soon: * beginning of March: full example within the docker image * beginning of March: DBpedia full article text and tables (currently only abstracts) Methodology The DBpedia Open Text Extraction Challenge differs significantly from other challenges in the language technology and other areas in that it is not a one time call, but a continuous growing and expanding challenge with the focus to *sustainably* advance the state of the art and transcend boundaries in a *systematic* way. The DBpedia Association and the people behind this challenge are committed to provide the necessary infrastructure and drive the challenge for an indefinite time as well as potentially extend the challenge beyond Wikipedia. We provide the extracted and cleaned full text for all Wikipedia articles from 9 different languages in regular intervals for download and as Docker in the machine readable NIF-RDF format (Example for Barrack Obama in English ). Challenge participants are asked to wrap their NLP and extraction engines in Docker images and submit them to us. We will run participants’ tools in regular intervals in order to extract: 1. Facts, relations, events, terminology, ontologies as RDF triples (Triple track) 2. Useful NLP annotations such as pos-tags, dependencies, co-reference (Annotation track) We allow submissions 2 months prior to selected conferences (currently _ that fulfil the technical requirements and provide a sufficient description will be able to present at the conference and be included in the yearly proceedings. *Each conference, the challenge committee will select a winner among challenge participants, which will receive 1000€. * Results Every December, we will publish a summary article and proceedings of participants’ submissions at _ proceedings are planned to be published in Dec 2017. We will try to briefly summarize any intermediate progress online in this section. Acknowledgements We would like to thank the Computer Center of Leipzig University to give us access to their 6TB RAM server Sirius to run all extraction tools. The project was created with the support of the H2020 EU project HOBBIT (GA-688227) and ALIGNED (GA-644055) as well as the BMWi project Smart Data Web (GA-01MD15010B). Challenge Committee * Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig * Sören Auer, Fraunhofer IAIS, University of Bonn * Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University * Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig * Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig Contact Email: uI applaud this initiative to extract triples from Wikipedia open text. However, it would be useful to initiate a parallel challenge/effort to represent a limited portion of the current Wikipedia article text as semantic representation, eliminating the text altogether. In this approach, the Wikipedia information would be semantically encoded as its original representation, as opposed to using text to represent the information. A small subset of Wikipedia subject matter could be used for this experiment. 
After the limited Wikipedia domain of interest was fully semantically represented, tools could be developed to translate the semantic representation into human readable text. It seems over the long run creating the original knowledge as a semantic representation, instead of text, would result in a Wikipedia knowledge base that upon query by humans could automatically perform the necessary translation into text in whichever human language the user desired. This concept would also facilitate machine to machine use of the Wikipedia knowledge base, which is currently difficult, if not impossible, due to the textual nature of the information. You could also envision tools that would eventually make it easy for authors to source the article information directly in semantic representation. The end results would be a DBpedia on steroids and the eventually elimination of Wikipedia as the original article text sources would no longer be needed. John Flynn From: Sebastian Hellmann [mailto: ] Sent: Monday, March 06, 2017 5:56 AM To: DBpedia Subject: [DBpedia-discussion] DBpedia Open Text Extraction Challenge - TextExt DBpedia Open Text Extraction Challenge - TextExt Website: Disclaimer: The call is under constant development, please refer to the news section. We also acknowledge the initial engineering effort and will be lenient on technical requirements for the first submissions and will focus evaluation on the extracted triples and allow late submissions, if they are coordinated with us. Background DBpedia and Wikidata currently focus primarily on representing factual knowledge as contained in Wikipedia infoboxes. A vast amount of information, however, is contained in the unstructured Wikipedia article texts. With the DBpedia Open Text Extraction Challenge, we aim to spur knowledge extraction from Wikipedia article texts in order to dramatically broaden and deepen the amount of structured DBpedia/Wikipedia data and provide a platform for benchmarking various extraction tools. Mission Wikipedia has become the ubiquitous source of knowledge for the world enabling humans to lookup definitions, quickly become familiar with new topics, read up background infos for news event and many more - even settling coffee house arguments via a quick mobile research. The mission of DBpedia in general is to harvest Wikipedia’s knowledge, refine and structure it and then disseminate it on the web - in a free and open manner - for IT users and businesses. News and next events Twitter: Follow @dbpedia , Hashtag: #dbpedianlp · LDK conference joined the challenge (Deadline March 19th and April 24th) · SEMANTiCS joined the challenge (Deadline June 11th and July 17th) · Feb 20th, 2017: Full example added to this website · March 1st, 2017: Docker image (beta) Coming soon: · beginning of March: full example within the docker image · beginning of March: DBpedia full article text and tables (currently only abstracts) Methodology The DBpedia Open Text Extraction Challenge differs significantly from other challenges in the language technology and other areas in that it is not a one time call, but a continuous growing and expanding challenge with the focus to sustainably advance the state of the art and transcend boundaries in a systematic way. The DBpedia Association and the people behind this challenge are committed to provide the necessary infrastructure and drive the challenge for an indefinite time as well as potentially extend the challenge beyond Wikipedia. 
We provide the extracted and cleaned full text for all Wikipedia articles from 9 different languages in regular intervals for download and as Docker in the machine readable NIF-RDF format (Example for Barrack Obama in English ). Challenge participants are asked to wrap their NLP and extraction engines in Docker images and submit them to us. We will run participants’ tools in regular intervals in order to extract: 1. Facts, relations, events, terminology, ontologies as RDF triples (Triple track) 2. Useful NLP annotations such as pos-tags, dependencies, co-reference (Annotation track) We allow submissions 2 months prior to selected conferences (currently Results Every December, we will publish a summary article and proceedings of participants’ submissions at Acknowledgements We would like to thank the Computer Center of Leipzig University to give us access to their 6TB RAM server Sirius to run all extraction tools. The project was created with the support of the H2020 EU project HOBBIT (GA-688227) and ALIGNED (GA-644055) as well as the BMWi project Smart Data Web (GA-01MD15010B). Challenge Committee · Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig · Sören Auer, Fraunhofer IAIS, University of Bonn · Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University · Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig · Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig Contact Email: uIsn't that Wikidata? uDear John, yes, your idea is actually what you can do with Wikidata already. Like you could replace part of the text with Wikidata information, which goes in the direction of your proposal. However, although the feature exists, I was unable to find any example of this in the article text, but I only looked briefly. Maybe somebody else? Not sure, whether you can tell whole articles, stories or essays in a semantic language. A lot of information would be lost, if you do it in OWL. Also it seems very hard to encode it even using something like Attempto Controlled English: Our motivation is that we do try getting more information with relation extraction for now until something better presents itself. all the best, Sebastian On 07.03.2017 15:31, Paul Houle wrote:" "Open Calls - Formal Ontologies meets Industry, Rule Challenge, Reasoning Web Summer School" "uDear Colleagues, RuleML 2015 ( August 2-5, co-located with the Conference on Automated Deduction (CADE), the Workshop on Formal Ontologies meet Industry (FOMI), the Conference on Web Reasoning and Rule Systems (RR) and the Reasoning Web Summer School (RW). The International Web Rule Symposium (RuleML) has been a leading international conference on research, applications, languages and standards for rule technologies. Since 2002 the RuleML event series has built bridges between academia and industry in the field of rules and its applications, especially as part of the semantic technology stack. Submit your work on Formal Ontologies meets Industry to the FOMI 2015 workshop and your rule-based business cases, solutions, demos, results, rule bases to the RuleML 2015 industry track and the 9th International Rule Challenge and learn more about \"Web Logic Rules\" at the 11th Reasoning Web Summer School. Phd students can also discuss their research at the joint RuleML and RR Doctoral Consortium. 
The open deadlines for your submissions are as follows: * 7th Workshop on Formal Ontologies meets Industry ( * 9th International Rule Challenge ( deadline May 23rd * RuleML Rulebase Competition ( deadline May 23rd * Challenge on Recommender Systems for the Web of Data ( * 5th RuleML Doctoral Consortium ( * 11th Reasoning Web Summer School ( May 10th * RR 2015 Doctoral Consortium ( * Industry Track (http://2015.ruleml.org/industrytrack.html) - deadline April 30th ++++++++++++ News ++++++++++++++++ - Keynotes and Invited Talks by Michael Genesereth on The Herbrand Manifesto - Thinking Inside the Box, Thom Fruehwirth on Constraint Handling Rules and Avigdor Gal on When Processes Rule Event Streams http://2015.ruleml.org/tutorials.html/keynotes.html - Tutorial Day http://2015.ruleml.org/tutorials.html - Standards Meetings, e.g. OMG API4KB, ISO Common Logic, OASIS LegalRuleML - Berlin Semantic Web Meetup http://www.meetup.com/The-Berlin-Semantic-Web-Meetup-Group/ - Industry Track http://2015.ruleml.org/industrytrack.html - Co-located with: CADE 2015, RR 2015, Reasoning Web 2015, FOMIS 2015 and further CADE workshops - Sponsors and Partners: ECCAI, AAAI, W3C, OMG, OASIS LegalXML, Association for Logic Programming, IEEE Technical Committee on Semantic Computing, IFCoLog, Signavio, Model Systems, Coherent Knowledge, Binarypark, ShareLatex, Corporate Semantic Web, Springer LNCS, Athan Services We are looking forward to meeting you in Berlin, Germany in August! Website: http://2015.ruleml.org/ LinkedIn: https://www.linkedin.com/groups/RuleML-Group-2190838 Facebook: https://www.facebook.com/RuleML Twitter hashtag: #ruleml2015 Blog: http://blog.ruleml.org Wiki: http://ruleml.org" "how to inference?" "uHi Hugh, Always got 'inference context does not exist' error, could you please take a look? Here shows my operations: :/disk1/sda/jchen/Structured$ cat scripts/inference rdfs_rule_set(' :/disk1/sda/jchen/Structured$ cat scripts/conflict.sparql sparql define input:inference ' PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: SELECT count(*) from WHERE { ?x rdf:type . }; :/disk1/sda/jchen/Structured$ cat scripts/inference | isql 1111 Connected to OpenLink Virtuoso Driver: 05.08.3034 OpenLink Virtuoso ODBC Driver OpenLink Interactive SQL (Virtuoso), version 0.9849b. Type HELP; for help and EXIT; to exit. SQL> Done. uHi juisheng, I presume you are trying to build on the inferencing example in our online documentation at: rdfsparqlrule.html#rdfsparqlruleintro which works for me and presumably works for you also ? Does the graph and triple set for the graph rule used in your rdfs_rule_set() call exist ie sparql/' as the error implies it doesn't which I also get not having such a graph in my local instance, but if i then create an arbitrary graph of that name with the ttlp() function (using the data in the documentation example for instance) and run you query then it does run successfully even though it obviously returns an empty result set as I don't have any dbpedia data in my test server, but proves my point: $ cat chen sparql define input:inference ' inference' PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: SELECT count(*) from WHERE { ?x rdf:type . }; $ cat chen | ////bin/isql 1112 Connected to OpenLink Virtuoso Driver: 05.00.3033 OpenLink Virtuoso ODBC Driver OpenLink Interactive SQL (Virtuoso), version 0.9849b. 
Type HELP; for help and EXIT; to exit. SQL> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> Type the rest of statement, end with a semicolon (;)> callret-0 INTEGER 0 1 Rows. uHi Hugh, Yes, I have graph ' local. :~$ isql 1111 Connected to OpenLink Virtuoso Driver: 05.08.3034 OpenLink Virtuoso ODBC Driver OpenLink Interactive SQL (Virtuoso), version 0.9849b. Type HELP; for help and EXIT; to exit. SQL> sparql select count(*) from { ?s ?p ?o}; callret-0 INTEGER 9124904 1 Rows. uHi Juisheng, Strange indeed, as the graph clearly exists and is accessible I shall see if anyone internal might have any other possible suggestions as to the cause Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 11 Sep 2008, at 12:13, jiusheng chen wrote: uHi Hugh, I use the Virtuoso open source v5.0.8, is it the reason? On Thu, Sep 11, 2008 at 7:44 PM, Hugh Williams < >wrote: uHi Jiusheng, The live dbpedia instance is hosted on a Virtuoso 5.0.8 build so I would not expect this to be an issue. I presume you have a local instance of dbpedia create from the available datasets and hosted in Virtuoso ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 11 Sep 2008, at 14:03, jiusheng chen wrote: uHi Hugh, Yes, you are right, we have a local Virtuoso instance hosting DBpedia datasets. is it the configuration issue on the server side? we changed values for MaxCheckpointRemap NumberOfBuffers MaxDirtyBuffers TransactionAfterImageLimit is there any parameter used to control inference? On Thu, Sep 11, 2008 at 10:51 PM, Hugh Williams < >wrote: uHi Juisheng The documentation URL ( rdfsparqlrule.html#rdfsparqlruleintro) I sent you yesterday includes and example on setting up inferencing for DBpedia as has been done on the live instance we host, have you tried using the example against your local instance ? There doesn't appear be any specific configuration file parameters required for controlling inferencing support. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 12 Sep 2008, at 01:38, jiusheng chen wrote: u" "simple LOD browser for a a demo system" "uCan anyone recommend software to stand up a simple linked data browser for a demonstration system we plan on hacking together next week? What we need is something very simple, much like the the code that produces the HTML for Our project is focused on extracting information from text, linking entities to existing linked data instances, mapping new extracted facts to the linked data vocabularies and adding the results to a single local Jena-based triple store that includes DBpedia content and some other linked data. 
We want to use a Web-browser to show what the KB knows for an entity before processing some text and then what it has for the entity after we've added new facts extracted from a set of text documents. Is the code behind the dbpedia.org web service available? My searching failed to find it. Any advice anyone can provide will be appreciated. Thanks, Tim uIl 11/01/2011 18:23, Tim Finin ha scritto: Hi Tim, the HTML version of a DBpedia resource is developed in the same way as Pubby [1]. Among other things, it provides a simple HTML interface showing the data available about each resource. Hope it helps, bye, roberto [1] uAm 11.01.2011 18:23, schrieb Tim Finin: The code behind the DBpedia resource pages is based on Disco: If you want something a little less puristic you might also try OntoWiki: It has a class-hierarchy browser, faceted-browsing, map views, customized views as well as various filter options built in. It allso supports the full range of LOD best-practices from content-negotiation, over SPARQL endpoint, Semantic Pingback to OpenID and FOAF+SSL. Best, Sören uHi, the code is available, see step 9 in [1] (dbpedia vad plugin) but i don't think it will very helpful since it based on Virtuoso Server Pages (VSP) and you are working with jena if you want to work only with dbpedia datasets (and virtuoso) it will work as is, otherwise it will probably need modifications regards, Jim [1] Hi, > Is the code behind the dbpedia.org web service available?  My > searching failed to find it.  Any advice anyone can provide will be > appreciated. the code is available, see step 9 in [1] (dbpedia vad plugin) but i don't think it will very helpful since it based on Virtuoso Server Pages (VSP) and you are working with jena if you want to work only with dbpedia datasets (and virtuoso) it will work as is, otherwise it will probably need modifications regards, Jim [1] VirtBulkRDFLoaderExampleDbpedia uOn 11/01/11 17:23, Tim Finin wrote: how are you accessing your jena data - with joseki ? 1) have some canned sparql in a html link, so a before and after 2) twinkle ? [1] (dont think its being actively maintained) otherwise, you need a way of content negotiating an URI for a subject in your data - are your URIs dereferencable - pubby can help you here, and then you could use marbles [2] say, or you might be able to install a firefox plugin like Tabulator [3] - (you may need to hack the installation files so that it appears compatible with latest versions) - or write your own code to display the results. [1] [2] [3] uOn 1/11/11 12:23 PM, Tim Finin wrote: Tim, A few questions: I assume your backend supports SPARQL? In particular, DESCRIBE and CONSTRUCT queries? In addition, do you have resolvable URIs for entities native to your triple store? Is coding mandatory to your endeavor? Are you set on a particular triple store? There are many routes to your destination. I can suggest path once I have better context re. your actual needs. Happy New Year!" "License of dbpedia content" "uHoi, I have had a look at dbpedia and I like it a lot. What I am wondering about is the motivation for the license. Is the license chosen because Wikipedia is available under the GFDL and consequently keeping this data under the GFDL prevents any potential issues? If so many in the Wikimedia Foundation that the choice for the GFDL is only because there were no real alternatives for it at the time when Wikipedia was started. 
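For a before/after comparison of what the local store knows about an entity, the quickest probe is a query straight against the local endpoint: a DESCRIBE returns whatever the store chooses to say about the URI, while a plain triple listing keeps it to directly asserted statements. A sketch, where the example URI is only an illustration and any standard SPARQL endpoint over the store is assumed:

DESCRIBE <http://dbpedia.org/resource/Baltimore>

SELECT ?p ?o
WHERE { <http://dbpedia.org/resource/Baltimore> ?p ?o }

Running the same query before and after the newly extracted facts are loaded makes the delta easy to eyeball or diff.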
In my opinion, by performing data-mining on Wikipedia you create a new work and consequently it does not need to be provided under the GFDL. Also when someone else takes the routines that are part of what dbpedia is, it will be his option to choose a licence. For me this is a very practical question, the project I work on, OmegaWiki, provides its data that is in the \"community dataset\" under a combined GFDL/CC-by license. The rationale for this is that we want people to know where to go when they use our data for updates or for amending our data. We allow for other datasets and these do not have to be under either license. The only thing we insist on is that changes will to these datasets occur in our community dataset and will be under the GFDL/C-by license. When you also consider that you cannot copyright facts, it is clear that there is relatively little merit in copyrighting in the first place. What you can copyright is a collection of facts but by doing so, dbpedia would make itself as a resource less valid and valuable. For Open Access the best license is considered by many to be the CC-by. Thanks, Gerard Meijssen Hoi, I have had a look at dbpedia and I like it a lot. What I am wondering about is the motivation for the license. Is the license chosen because Wikipedia is available under the GFDL and consequently keeping this data under the GFDL prevents any potential issues? If so many in the Wikimedia Foundation that the choice for the GFDL is only because there were no real alternatives for it at the time when Wikipedia was started. In my opinion, by performing data-mining on Wikipedia you create a new work and consequently it does not need to be provided under the GFDL. Also when someone else takes the routines that are part of what dbpedia is, it will be his option to choose a licence. For me this is a very practical question, the project I work on, OmegaWiki, provides its data that is in the 'community dataset' under a combined GFDL/CC-by license. The rationale for this is that we want people to know where to go when they use our data for updates or for amending our data. We allow for other datasets and these do not have to be under either license. The only thing we insist on is that changes will to these datasets occur in our community dataset and will be under the GFDL/C-by license. When you also consider that you cannot copyright facts, it is clear that there is relatively little merit in copyrighting in the first place. What you can copyright is a collection of facts but by doing so, dbpedia would make itself as a resource less valid and valuable. For Open Access the best license is considered by many to be the CC-by. Thanks, Gerard Meijssen uHi Gerard, On 10/20/07, GerardM < > wrote: At least under Dutch law this is not true. While the facts indeed cannot be copyrighted, computer extracting of facts *is* breach of copyright. (Case in Dutch court: someone had the Dutch phone library (detelefoongids) copied in India manually which *is* OK. More expensive, but does not involve computers copying the information) Egon uHoi, It does not apply because in the case of the \"telefoonboek\" the information that was retrieved was exactly the same as used for the application. In this case we have a text and derive relations from it. In the process all the prose is dropped. Thanks, Gerard On 10/20/07, Egon Willighagen < > wrote: uGerard, On 20 Oct 2007, at 10:43, GerardM wrote: Exactly. [snip] I think this is very bad advice. 
Our dataset contains the first paragraph of every Wikipedia article. That's not a new work, it's derivative. Hence it has to be published under the GFDL. [snip] By the reasoning above, CC-by is not an option for some parts of the dataset. The case is less clear for other parts, but before we think about applying different licenses to different parts of the dataset, we would like to see some concrete evidence that the GFDL causes problems for potential DBpedia users. Richard uRichard Cyganiak wrote: All, The critical point here (apropos Richard's comments): what is the perceived damaging effect of the GFDL on DBpedia users and on developers of applications based on DBpedia's rendition of Wikipedia data as RDF-based structured linked data? uHoi, The most relevant aspect of dbpedia is, in my opinion, the opportunity to link into other ontologies. Because of its GFDL license you may be able to get permission to link other resources into dbpedia, but when work on the data leads to what amounts to improvements for the other party, that party cannot take these improvements and incorporate them into its own resource, because of the viral aspects of the GFDL license. A case in point: in the database of the OmegaWiki environment, we allow for multiple datasets. Two of these datasets are of scientific importance. If our community dataset is only available under the GFDL, new information in our community dataset that may be of interest to the Swiss Institute of Bioinformatics or the National Library of Medicine cannot be taken up by them. This would be utterly unacceptable and we would not be able to collaborate with them. There are, however, two distinct issues that I raised: - I asked you to consider the license of dbpedia itself - The license of the result of a run of the dbpedia software itself is not necessarily under the GFDL. Potential cooperation will not be as strong and vibrant when people are forced to run the dbpedia software themselves and pick what is of interest to them without giving back. Most valuable is the cooperation and the extension of data that becomes available by connecting resources. Thanks, Gerard On 10/20/07, Kingsley Idehen < > wrote: uGerardM wrote: Gerard, DBpedia is about an RDF Linked Data Hub (a Data Junction Box of sorts for the Data Web). I don't see how the GFDL impairs your ability to use DBpedia URIs in your work. The aim here is to virally propagate Open Linked Data in line with an overarching Open Data vision for the Web. If you were to import DBpedia data into a local system and then produce a variant of the data set without dereferenceable URIs, or work with customers that seek such a solution, then I believe this is contrary to the aims of DBpedia. If you look at Wikipedia derivatives, the data is never locked. The key here is to never morph the data into a format that is locked. That said, I am replying based on the spirit of the project rather than having combed through the GFDL :-) I recall responding in the past about the spirit of iODBC in relation to the GPL, only to find the GPL incongruent with what I espoused, which led to a serious flame war between Richard Stallman and me, and ultimately to a data access regression in the SQL realm, as demonstrated by: 1. iODBC and unixODBC being two projects that offer the same thing with little value on the constructive competition front 2. A LAMP stack that is DBMS specific (negative implications are still working their way through to the surface across the user and developer realms) Kingsley uWhat damage??
You're free to take free data if you like. The GFDL does not prevent you to do anything, it only creates new possibilities. If you don't like the GNU FDL (or any license), then build a resource like that yourself (but that's likely not as cheap as using the GFDL-ed data). Egon uEgon, On 21 Oct 2007, at 09:17, Egon Willighagen wrote: Huh? Did anyone mention damage? As I understand it, that's not true. The GFDL prevents me, for example, from including Wikipedia excerpts in a publication that contains content licensed under certain CC licenses. Oh, if I'm not happy with Wikipedia's license, I should just go away and build my own Wikipedia. That is not a very practical suggestion. Gerard brought up a valid concern. I hope that we can discuss the issues surrounding DBpedia licensing without knee-jerk pro- or anti- GFDL zealotry. A much more interesting question is wether the DBpedia datasets constitute new works that can be licensed any way we want, or wether they are derivative works and therefore *must* be GFDL-licensed to comply with the Wikipedia license. Can you shed any light on that, Egon? Thanks, Richard uOn 10/21/07, Richard Cyganiak < > wrote: Yes, Kingsley Idehen did. Accidentally removed his quote. Here it is: \"The critical point here (apropos Richard's comments): What is the perceived damaging effect of GFDL on DBpedia Users and developers of applications based DBpedia's rendition of Wikipedia data as RDF based structured linked data?\" I would rather rephrase that and say: \"The GFDL allows me to use Wikipedia excerpts in a publication, as long as the license of derivatives is compatible\". It's like: you can come to my party, IF you behave nicely. I don't think that really restricts you. Very practical, actually. It just takes much time; effort put in by people who believe in the GFDL license. I think (I'm not a lawyer) that DBPedia *is* a derivate under Dutch law, as machines have been involved to derive information from the original information. My reasoning is: the information in DBPedia is 1:1 based on the WP content: if the WP content would make different triples, DBPedia would follow that. The fact that there is no (manual) interpretation involved, only strengthens my believe that DBPedia is really no new 'work'. There are also some rules for how much you may cite. DBPedia also cites much more statements from WP than any legal number would allow, making DBPedia really a derivative. Egon uHoi, I disagree with Egon. There is substantial effort invested in creating the routines that produce the data that is dbpedia. Also the data is substantially different from the information that is in Wikipedia. I have discussed issues like this with several people in the Wikimedia Foundation and their attitude is typically that they want people to do good with the WMF data. Also by suggesting that dbpedia is only something that is based on Wikipedia, you deny its potential. Where it says that GFDL allows things as long as it is in a compatible license, it is wrong, there are no compatible licenses to the GFDL. There is some work going on to make the GFDL compatible with other licenses but from what I understand this is not going anywhere fast. It is not even likely to get anywhere. Thanks, Gerard On 10/21/07, Egon Willighagen < > wrote: uOn 21 Oct 2007, at 12:05, GerardM wrote: But does the amount of invested effort matter at all? Does a derivative work stop being a derivative work if sufficient effort has been spent on it? How much effort? That's debatable. 
Every single RDF literal in DBpedia is lifted from some place in Wikipedia, by a conceptually fairly simple mechanical process. I have no doubt about that. And perhaps they have not chosen the best possible license for that goal. This doesn't change the fact that any re-use of Wikipedia content is subject to the GFDL. I don't think being based on Wikipedia takes away from DBpedia's potential. Here's another interesting data point: Freebase uses Wikipedia content. Their licensing page [1] states: “Texts in Freebase are documents with potentially many authors. Texts may have one of two different licenses: [] GNU Free Documentation License (GFDL). Descriptions that are automatically summarized from Wikipedia articles are licensed under the GFDL, which requires re- publishers of the summary to credit Wikipedia. All Freebase topics that display material from Wikipedia articles provide explicit reference to the GFDL and the Wikipedia.” But the Freebase folks don't seem to think that this “infects” the rest of their content and data, which is CC-by licensed. Richard [1]" "How the property occurrences is counted" "uHi DBpedian, On the statistics report of DBpedia, e.g. A number on top of the report looks confusing to me: *6.32 % of all templates in Wikipedia (en) are mapped (368 of 5826).3.47 % of all properties in Wikipedia (en) are mapped (6169 of 177599).* It reads like Wikipedia(en) has 5826 templates in total, but from my observation, the number of templates on enwiki is much more than that. Could you show me how these metrics are defined? Thanks uHi Jiang, You are right, this is not the total number of Wikipedia templates. We have some regex patterns [1] that exclude some templates before we calculate the statistics. Best, Dimitris [1] On Tue, Sep 17, 2013 at 7:08 PM, Jiang BIAN < > wrote:" "Use core 3.9 in sbt with scala 2.10.2" "uHi, I would like to use the core of the extraction framework with sbt. I cloned the git repository, compiled the lib with maven and then copied then core-3.8.jar to my sbt project folder. It works for a simple test tool. However, I want to use Slick and need Scala 2.10.2 for that. In a release not it says that core 3.9. uses scala 2.10.2 but mvn compile results in a 3.8 that uses scala 2.9.2. What am I doing wrong? thx for your help, Karsten Hi, I would like to use the core of the extraction framework with sbt. I cloned the git repository, compiled the lib with maven and then copied then core-3.8.jar to my sbt project folder. It works for a simple test tool. However, I want to use Slick and need Scala 2.10.2 for that. In a release not it says that core 3.9. uses scala 2.10.2 but mvn compile results in a 3.8 that uses scala 2.9.2. What am I doing wrong? thx for your help, Karsten uHi Karsten, the code used to build DBpedia 3.9 (which has version 3.9 as well) has not been merged to the master branch yet. I think you can use the \"dump\" branch for a quick test (it uses Scala 2.10.2) but please note that it is not aligned with master. Hope this helps. Cheers Andrea 2013/11/19 Karsten Jeschkies < > uHi, the dump branch seems to work. It would be nice to add the extractor framework to Ivy as well. sbt uses Ivy if I am not mistaken. Thanks for the quick help, Karsten On 19 November 2013 12:27, Andrea Di Menna < > wrote: uHi Karsten, yes adding the framework on Maven central would be a cool enhancement. Yes sbt uses Ivy, but as the dependency resolution manager. 
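On the statistics question above: the percentages on the statistics pages are computed by the extraction framework from the dumps (after the regex-based template exclusions Dimitris mentions), but for a rough feel of how often a single raw infobox property occurs you can also count it on the public endpoint; a sketch, where the property is only an example:

PREFIX dbp: <http://dbpedia.org/property/>
SELECT (COUNT(*) AS ?occurrences)
WHERE { ?page dbp:birthPlace ?value }

The number will not match the wiki statistics exactly, since those are derived from the template definitions rather than from the loaded triples, but it is a quick sanity check.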
What you really want is the DBpedia framework on a public maven repo though :p Cheers Andrea 2013/11/19 Karsten Jeschkies < >" "likely URI error in Turtle output for Buddy_Guy" "uHow to reproduce : wget uOn Sep 14, 2014 1:11 PM, \"Jean-Marc Vanel\" < > wrote: uThanks Jona for the quick sunday response ! Of course using the full IRI between is fine; anyway this is already done often, like: dbpprop:artist dbpedia:Buddy_Guy . I wonder how long it will take to change the server. If understand well, the problem is in the server code, not in the data . No hurry on my side, I've changed my test using dbpedia . 2014-09-14 17:51 GMT+02:00 Jona Christopher Sahnwaldt < >: uThanks JC and Jean I submitted an issue on VOS for this case On Sun, Sep 14, 2014 at 7:08 PM, Jean-Marc Vanel < > wrote:" "Travis CI failing on pull requests" "uHi, does anyone know why this is happening? Cheers Andrea 2013/3/20 Andrea Di Menna < >" "Trouble with my use cases" "uHi I recently fixed both demos written by me and featured on the Use Cases page. These had broken due to vocab changes in dbpedia and a problem with my server as discussed in my blog Checking again today, they are broken again: Football - for a footballer, the property dbpprop:currentclub was reasonably well populated to reflect the current players in a team. Now that property is present for only a few players, too few to be be useful. Some teams now have dbpprop:name referencing a player which is wrong. One problem here is that the relationship between a player and a team is not a resource. This leads to multiple disassociated dbpprop:name, dbpprop:no (position) ,dbpprop:nat and dbpprop:other properties - see membership was represented as a resource with these properties, then the data would be meaningful and moreover dates could be added to allow team composition at any time to be constructed. Albums - when I wrote this a couple of years ago, thumbnail pointed to a jpg and this demo looked rather nice, especially the Simile timeline. When I last fixed it, the thumbnails had gone and only a local file name for the cover art was available as dbpprop:cover. Now thumbnails are back, but the URI is garbled: should be Regards Chris Hi I recently fixed both demos written by me and featured on the Use Cases page. These had broken due to vocab changes in dbpedia and a problem with my server as discussed in my blog Checking again today, they are broken again: Football - for a footballer, the property dbpprop:currentclub was reasonably well populated to reflect the current players in a team. Now that property is present for only a few players, too few to be be useful.  Some teams now have dbpprop:name referencing a player which is wrong. One problem here is that the relationship between a player and a team is not a resource.  This leads to multiple disassociated dbpprop:name, dbpprop:no (position) ,dbpprop:nat and dbpprop:other properties - see as an example. If team membership was represented as a resource with these properties, then the data would be meaningful and moreover dates could be added to allow team composition at any time to be constructed. Albums - when I wrote this a couple of years ago, thumbnail pointed to a jpg and this demo looked rather nice, especially the Simile timeline.  When I last fixed it, the thumbnails had gone and only a local file name for the cover art was available as dbpprop:cover.  Now thumbnails are back, but the URI is garbled: Chris" "linked geo data?" 
"uHi there, what do you think about storing monuments and other interesting amenities in dbpedia? mike uOpenStreetMaps -> RDF = Is this, what you are looking for? I think, it is synchronized live (all changes to OSM become RDF in a couple of days), but you would need to ask there. Sebastian Am 05.08.2012 14:00, schrieb Mike Dupont: uI am talking about dbpedia, osm is changing its license and not going to be compatible. I am working on fosm.org to rescue lost data. If you have good hosting I would be interested in working on putting not all data but important data in the dbpedia. as i said monuments etc could be synched. mike On Sun, Aug 5, 2012 at 12:16 PM, Sebastian Hellmann < > wrote: uHello Mike, I am having difficulties understanding you. OSM is changing its license to what? and how do you know, can you share a link? Wikipedia -> RDF = DBpedia means, that you would have to include your data in Wikipedia for us to include it, I guess. Is this what you intend to do? Is fosm already in RDF? All the best, Sebastian Am 05.08.2012 14:44, schrieb Mike Dupont: uOsm is changing the license to one that is not compatible in any way with wikipedia. I would create the rdf data, that is not a problem. I could create my own transformation or reuse the linkedgeodata code, I have been studying rdf and semweb for 10 years now. It is just that I have not had or seen any good hosting of rdf that was publically usable before. mike On Sun, Aug 5, 2012 at 1:48 PM, Sebastian Hellmann < > wrote: uHave you looked at TheDataHub.org? They may be able to host the RDF dumps. For Linked Data and SPARQL, perhaps the LinkedGeoData folks will be able to help you. Have you contacted them? DBpedia will be able to host the links from DBpedia identifiers to FOSM identifiers. But hosting other facts derived from 3rd-party sources (other than Wikipedia) would be a new thing for us. I am not against it, but we'd need good documentation, update processes, and visual cues for users browsing the data. It seems to be already too confusing with our very few extractors. I don't know what the others think. But being an aggregator of other datasets that overlap with DBpedia is a possible evolution path for the project. It already works as such a nucleous in a distributed manner, but perhaps hosting some other prominent sources in situ and effectively tracking/displaying provenance could be an interesting new dimension. On Aug 5, 2012 3:57 PM, \"Mike Dupont\" < > wrote: Have you looked at TheDataHub.org? They may be able to host the RDF dumps. For Linked Data and SPARQL, perhaps the LinkedGeoData folks will be able to help you. Have you contacted them? DBpedia will be able to host the links from DBpedia identifiers to FOSM identifiers. But hosting other facts derived from 3rd-party sources (other than Wikipedia) would be a new thing for us. I am not against it, but we'd need good documentation, update processes, and visual cues for users browsing the data. It seems to be already too confusing with our very few extractors. I don't know what the others think. But being an aggregator of other datasets that overlap with DBpedia is a possible evolution path for the project. It already works as such a nucleous in a distributed manner, but perhaps hosting some other prominent sources in situ and effectively tracking/displaying provenance could be an interesting new dimension. On Aug 5, 2012 3:57 PM, 'Mike Dupont' < > wrote: uI have looked at ckan, yes. It is not for hosting triples of data, it is for hosting information about the data. 
I would like to have the data of wikipedia and be able to merge it/query it with geographic data. I am looking forward to having large wikimedia hosted servers that are usable. I have long dreamed about a real semantic web and this looks like it will be getting close. I dont have the HW resources to process this large amount of data. I started with a large number of articles on geographic points, on cities and towns of kosovo. Sources are GNS which are used for many articles. I have scripts to convert them to wikipedia templates, we can also convert them to rdf. All my data is CC-by-sa compatible and should not be a problem. On Mon, Aug 6, 2012 at 7:14 AM, Pablo N. Mendes < > wrote: Well I am wiling to do this right. The things that are interesting are : GNS/GEOnames places and things, they are notable and large enough. We need to make sure that they are all in wikipedia. This will be useful just for looking for names, i hope that we can get all the name variants into the database and make them usable. uHi Mike, TheDataHub.org was formerly known as CKAN.net, initially named after the software package that powers it (CKAN) and is mostly a catalog of dataset metadata, but they also offer data hosting. See: They are supporters of the LOD community, so it could be an option. I also think it would be worth pinging the LinkedGeoData folks before deciding on anything. Cheers Pablo On Aug 6, 2012 9:25 AM, \"Mike Dupont\" < > wrote: Hi Mike, TheDataHub.org was formerly known as CKAN.net, initially named after the software package that powers it (CKAN) and is mostly a catalog of dataset metadata, but they also offer data hosting. See: wrote: uHi Pablo, I added this on the agenda for the MLODE f2f discussion session: We can discuss it now on the list and track/collect results in the etherpad as input. Sebastian Am 06.08.2012 09:14, schrieb Pablo N. Mendes: uHi all, +1 I will notify Claus today, if I meet him. Sebastian uOn Mon, Aug 6, 2012 at 8:36 AM, Pablo N. Mendes < > wrote: Ok fine, I am open to this. I will start to do some simple learning on the standard procedures for dbpedia, thanks mike" "About Giuseppe's message (TellMeFirst APIs available for developers)" "uHi all, Friends forwarded me Giuseppe Futia's message below and I thought I'd comment it. First of all, compliments to the TellMeFirst team! I very much like the way they present their results. Second and foremost, I realize that Italy is becoming a fertile land for semantic technologies and APIs based on DBPedia and Linked Open Data. Along with TellMeFirst, we can count our company (Machine Linking API, ( mature to absorbe such technology: how could we imagine a joint way to promote semantic applications in the Italian market? Best," "Links in Infobox and DBPedia" "uHello there, I am kind of new to DBPedia so I just read the documentation provided at home as well as the article \"DBpedia - A Crystallization Point for the Web of Data\". As far as I understand, infoboxes are used to \"export\" wikipedia content to DBPedia. So I took a look at infobox here includes links to UniProt proteins, then I went to related to UniProt in there, maybe some thing like rdfs:seeAlso, but it is not. So, what kind of information from the infoboxes is actually included into DBPedia? Why links in infobox are not in DBPedia? Thansk in advanced for any help you can provide me in this regard. 
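A quick way to see exactly which infobox values made it into DBpedia for a given page is to list both the raw (http://dbpedia.org/property/) and the mapped (http://dbpedia.org/ontology/) statements of the resource; a sketch, with Insulin standing in only as a placeholder for the protein article in question:

SELECT ?p ?o
WHERE {
  <http://dbpedia.org/resource/Insulin> ?p ?o .
  FILTER ( strstarts(str(?p), "http://dbpedia.org/property/")
        || strstarts(str(?p), "http://dbpedia.org/ontology/") )
}

External links from infoboxes, when they are extracted at all, typically show up as plain raw property values or under dbo:wikiPageExternalLink rather than as rdfs:seeAlso, which is probably why nothing UniProt-shaped appears where one would expect it.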
Cheers, Leyla Hello there, I am kind of new to DBPedia so I just read the documentation provided at home as well as the article 'DBpedia - A Crystallization Point for the Web of Data'. As far as I understand, infoboxes are used to 'export' wikipedia content to DBPedia. So I took a look at Leyla" "Google Squared" "uHi all, Google just launched it's Squared [1] search application, which arranges search results as tables and seems to be largely based on extracting information from Wikipedia (although Google claims it uses the whole Web, which according to my experiences seems to be only rarely the case). I wonder whether they also used some kind of infobox extraction - which I suspect due to the pretty good quality of the results. Cheers, Sören [1] uBut it is still very buggy: What does Bossam, a type of Korean cuisine have to do with the Semantic Web? Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Thu, Jun 4, 2009 at 3:57 PM, Sören Auer < >wrote: uJuan Sequeda wrote: Google is still working through text rather than entity graphs distilled from the pages in their indexes. And until they do this, they will not be able to achieve what we've been demonstrating for eons re: \"Entity Properties\" for disambiguated search (aka. Find). Another major flaw is that they don't expose identifiers for their results let alone using HTTP URIs as identifiers. What could interesting (becuase squared simply isn't), is if their URIs supported parameters for formats, and via this option they expose alternative representations of search results. This is all that is required for Google to be a more constructive Linked Web data provider. Kingsley uHi Sören, Sören Auer wrote: Actually, I think this work derives more from Alon Halevy's stuff and WebTables [1]. IMO this release is extremely significant and will pave the way to attribute characterizations for virtually any instance type imaginable. In fact, what they already have is massively impressive, though with false positives. This tsunami has been foreseeable for some time [2]. Thanks, Mike [1] [2] uMike Bergman wrote: Mike, Alon knows about Identifiers (he refers to them as GUIDs in his many writings about dataspaces and related topics), so why won't Google expose their GUIDs? Even if the GUIDs aren't HTTP URIs? Either way, the potential value remains immense. Until they expose the identifiers of the rows in those tables (presentation) or at least allow negotiation of search results representations, I will struggle to qualify their current release as impressive, in any shape or form :-) Kingsley" "topical_concepts is gone" "uThis is posted as > 2. skos:subject as used in topical_concepts is not good, since there's no such property. However, now the data is gone from dbpedia.org: this query returns only a few thousand. `select * {?x skos:subject ?y}` But the dataset This one finds no topical link: select * {dbcat:Programming_languages ?p dbr:Programming_language } And could someone explain where this skos:related came from? dbr:Category:Programming_languages skos:related dbr:Category:Programming_language_topics IMHO the topical links are quite important for some classification tasks. E.g. WiBitaxonomy could use them strongly to tie up the page vs category hierarchy." "What's wrong with this SPARQL query?" 
"uHi all, can you figure out why the following SPARQL query does not return any results on on PREFIX xsd: PREFIX dbpedia-owl: SELECT ?person WHERE{ ?person dbpedia-owl:birthDate \"1967-08-21\"^^xsd:date . } I suspect it's a problem of the \"official\" endpoint. Any idea about that? cheers, roberto uOn 9/14/11 1:14 PM, Roberto Mirizzi wrote: Short answer: DBpedia 3.7 has the property: . LOD Cloud Cache: . Detailed answer coming. uOn 9/14/11 1:14 PM, Roberto Mirizzi wrote: Roberto, I dropped a more detailed response on G+ at: includes revises SPARQL queries that leverage inference context so that property name disparity no longer messes up the SPARQL query results. One problem we've discovered is that our proxy settings are skewing the results. There are also some timezone issues that resulted from doing DBpedia dataset QA in one timezone and uploading the finished product to another, hence the tweaks to your original query re. timezone handling. uIl 14/09/2011 12:26, Kingsley Idehen ha scritto: Hi Kingsley, thank's for your short answer and for the long one. :-) Anyway, concerning to the short answer, if I ask a similar query on both endpoint, I get some results in both cases: PREFIX xsd: PREFIX dbpedia-owl: SELECT * WHERE{ ?person dbpedia-owl:birthDate ?d . } And if I go to dbpedia-owl:birthDate and dbpprop:dateOfBirth. Again, if I try the following query on both endpoints, I get some results on lod.openlinksw.com/sparql and no results on dbpedia.org/sparql: PREFIX xsd: PREFIX dbpprop: SELECT ?person WHERE{ ?person dbpprop:dateOfBirth \"1967-08-21\"^^xsd:date . } In other words, ?person dbpedia-owl:birthDate ?d . returns results on both ?person dbpedia-owl:birthDate \"1967-08-21\"^^xsd:date . no results on dbpedia ?person dbpprop:dateOfBirth ?d . returns results on both ?person dbpprop:dateOfBirth \"1967-08-21\"^^xsd:date . no results on dbpedia cheers, roberto uOn 9/15/11 3:53 PM, Roberto Mirizzi wrote: Inference context applied to both data spaces should harmonize your results :-)" "URIs vs. other IDs (Was: New user interface for dbpedia.org)" "uHi Kingsley, We are getting a bit off-topic here, but let me answer briefly On 07.02.2015 21:36, Kingsley Idehen wrote: That's not my point (I know the difference, of course). Wikidata stores neither Wikipedia URLs nor DBpedia URIs. It just stores Wikipedia article names together with Wikimedia site (project) identifiers. The work to get from there to the URL is the same as the work to get to the URI. Storing either explicitly in another property value would only introduce redundancy (and potential inconsistencies). In a Linked Data export you could easily include one or both of these URIs, depending on the application, but it's not so clear that doing this in a data viewer would make much sense. Surely it would not be useful if people would have to enter all of this data manually three times. On that note, is it the current best practice that all linked data exports include links to all other datasets that contain related information (exhaustive two-way linking)? That seems like a lot of triples and not very feasible if the LOD Web grows (a bit like two-way HTML linking ;-). Wouldn't it be more practical to integrate via shared key values? In this case, Wikipedia URLs might be a sensible choice to indicate the topic of a resource, rather than requiring all resources that have a Wikipedia article as their topic to cross link to all (quadratically many) other such resources directly. I would be curious to hear your take on this. 
Sure, but you are confusing the purpose of URIs with the underlying technical standard here. People use identifiers to refer to entities, of course, yet they do not use identifiers that are based on the URI standard. We both know about the limitations of this approach, but that does not change the shape of the IDs people use to refer to things (e.g., on Freebase, but it is the same elsewhere). Usually, if you want to interface with such data collections (be it via UIs or via APIs), you need to use their official IDs, while URIs are not supported. This is also the answer to your other comment. You are only seeing the purpose of the identifier, and you rightly say that there should be no big technical issue in using a URI instead. I agree, yet it has to be done, and it has to be done differently for each case. There is no general rule for how to construct URIs from the official IDs used by open data collections on today's Web. A related "problem" is that most online data sets have UIs that are much more user friendly than any LOD browser could be based on the RDF they export. There is no incentive for users to click on a LOD-based view of, say, IMDB, if they can just go to the IMDB page instead. This should be taken into account when building a DBpedia LOD view (back on topic! ;-): people who want to learn about something will usually be better served by going to Wikipedia; the target audience of the viewer is probably a different group who wants to inspect the DBpedia data set. This should probably affect how the UI is built, and maybe will lead to different design decisions than in the Wikidata browser I mentioned. Markus uOn 2/7/15 6:07 PM, Markus Kroetzsch wrote: Markus, Cutting a long story short: yes, you have industry standard identifiers, and you have HTTP URIs that identify things in line with Linked Open Data principles. You simply use relations such as dcterms:identifier (and the like) to incorporate industry standard identifiers into an entity description. Even better, those relations should be inverse-functional in nature. That's really it. DBpedia identifiers (HTTP-URI-based references) and industry standard identifiers (typically literal in nature) aren't mutually exclusive. Getting back on topic, Reasonator is a nice UI. What it lacks, from a DBpedia perspective, is incorporation of DBpedia URIs, which is an issue the author of the tool assured me he will be addressing as a high priority.
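To see what this looks like in the data today, the cross-links and any literal identifiers already attached to a DBpedia resource can be listed directly; a sketch (Berlin is only a placeholder, and whether a given release actually carries dcterms:identifier literals varies):

PREFIX owl:     <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?relation ?target
WHERE {
  <http://dbpedia.org/resource/Berlin> ?relation ?target .
  FILTER ( ?relation = owl:sameAs || ?relation = dcterms:identifier )
}

The owl:sameAs rows are exactly the kind of bridge discussed here: the HTTP URI stays the reference, while any industry-standard ID can ride along as a literal.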
To see how we do this look at one of the following, post completion of a facet/pivot filter sequence: 1. 2. When you describe an entity, note the \"Settings\" tab, this is where you enable: 1. owl:sameAs inference context 2. select an existing custom inference rule (e.g. b3sifp which shows custom use of foaf:name as an IFP). Again, these are just options that simply augment the slot you currently have for Virtuoso :-)" "Dbpedia lookup results with the word ''Category"." "uHi! I'm using that code: to send words and get candidate URIs from the DBpedia Lookup. The problem is: all the URIs come with the word \"Category\", for example: For the word \"Berlin\" it returns: instead of If I put the first URI (the one with the \"Category\") on the browser, it doesn't show me the page corresponding to the subject \"History_of_Berlin\", it returns me a page that contains a list of links and where I can find the link to \"History_of_Berlin\". But If I put the second URI (the one without \"Category\") it returns me the page corresponding to the subject \"History_of_Berlin\". How could I AVOID having these URIS WITH \"CATEGORY\" being returned from Lookup? Thank you! Hi! I'm using that code: you!" "Ontology Classes" "uHi, Does anybody know what's happened to Ontology Classes page [1]? I don't see some important classes there anymore (e.g. Work [2]). [1] [2] Work uHi Sergey and All, Yes, we are aware of the problem and are working on it - will be fixed soon. Some new mappings between DBpedia and another ontology have caused this issue. Best, Volha On 7/14/2014 10:59 AM, Sergey Skovorodkin wrote: uHi Sergey, I deployed a fix to the server and now all classes should be listed again. Best, Daniel On 7/14/2014 11:10 AM, Volha Bryl wrote:" "how to get all "?topic rdf:type dbpedia-owl:ChemicalCompound" ?" "uHi all, I have looked at the download site, but could not figure out which link to use for all data, as the table only seems to give subset viewsand, when I use the the SPARQL end point, I only get the chemical compounds down to about 'C', instead of the full data set What am I doing wrong here? Egon uEgon, Do you want to know the *URI* of everything that's of type :ChemicalCompound? Or do you want the *data* about these things? All the data or just some of it (titles, abstracts, infoboxes, ?) Richard On 14 Jan 2009, at 09:57, Egon Willighagen wrote: uOn 1/14/09, Richard Cyganiak < > wrote: Eventually, I like to get hold of all the datahaving all the URIs is nice, as I can wget them one-by-one, but not so efficient, I guess Egon uHi, Egon uOn 1/14/09, Ted Thibodeau Jr < > wrote: I did notice not getting everythingthat's actually what I reportedI get everything from A to about 'C' Makes sense. uOn 1/14/09, Ted Thibodeau Jr < > wrote: PREFIX rdf: PREFIX dbpedia-owl: select distinct ?c where {?c rdf:type dbpedia-owl:ChemicalCompound} See: Should not be overly many hits The SPARQL end point now returns Egon uOn Jan 14, 2009, at 11:19 AM, Egon Willighagen wrote: See: Section 3 tells you about each of the downloads in Sections 1 and 2. (Yes, this presentation leaves something to be desired. Eventually, I'll figure out how to rework the templates for this auto-generated page.) 
Be seeing you, Ted uHi Egon, On Wed, Jan 14, 2009 at 5:19 PM, Egon Willighagen < > wrote: I think the DBpedia SPARQL endpoint limits the query results to 1000 entries, but you may use multiple queries with LIMIT and OFFSET to retrieve more: select distinct ?c where {?c rdf:type dbpedia-owl:ChemicalCompound} LIMIT 1000 OFFSET 0 select distinct ?c where {?c rdf:type dbpedia-owl:ChemicalCompound} LIMIT 1000 OFFSET 1000 select distinct ?c where {?c rdf:type dbpedia-owl:ChemicalCompound} LIMIT 1000 OFFSET 2000 [and so on until no results are returned] Julius uEgon uHi, Julius uOn 14 Jan 2009, at 15:34, Egon Willighagen wrote: Download this one: Then do a grep: grep \"/ontology/ChemicalCompound\" types-mappingbased.nt This gives you the 4390 rdf:type triples that involve ChemicalCompounds. Fetching them all with wget is probably more efficient than dealing with the several 100M triples contained in the entire dataset. Best, Richard uOn Wed, Jan 14, 2009 at 5:46 PM, Ted Thibodeau Jr < > wrote: Right, thanks for pointing this out! Julius" "Core Datasets in other languages" "uHi, Are you panning on releasing the remaining core datasets on all languages? I could really use the Articles Categories for Portuguese, as well as some redirection and disambiguation info for the PT Wikipedia as well. If you need someone who has Portuguese as a mother tongue, I can help you. Cheers, u+1. I'm aware that this may be difficult to achieve, but it could be worth some effort: For German, there are only 422,819 labels available in DBpedia for matching with other vocabs (versus 2,866,994 labels plus 3,117,024 redirects for English). The German Wikipedia has 864,699 pages and more than 609,000 redirects, so only a small part of the information available there can be used in DBpedia. Are there some ideas out there how foreign-language-only entries could be added to the DBpedia dataset? (and how we could get rid of foreign-language entries which are connected to the English version afterwards and therefore are sameAs the first class (English) DBpedia entries?) Cheers, Joachim uHello Joachim, I agree that there's a demand for URIs for foreign-language-only entries. One approach *could* be to create DBpedia namespaces for every language, such as languages with owl:sameAs links. So But this would mean minting a lot of new URIs, and I'm not completely convinced of that approach. But I don't really see another solution for the problem of URIs for foreign-language-only entries. We can't add the foreign entities to the main dbpedia namespace, because that would mess things up completely, and there certainly are a lot of articles with the same name but referring to different concepts across languages. Note that we can represent data extracted from infoboxes from other languages by using e.g. Named Graphs, so this is only about those entities which are *not* in the English Wikipedia as well. So I'd like to ask the community: 1) Do you like the above approach? 2) How deeply do you need URIs for non-english concepts (those which can't be added to the English Wikipedia) Cheers, Georgi uGeorgi Kobilarov wrote: Georgi, We have to mint new URIs, and there is nothing wrong with that; as along as each entity has an \"rdf:type\" that links into one or more reasonable taxonomies. The rest of the work is for inference and reasoning technology to handle. Language should not narrow the breadth or depth of potential conceptualization. 
Kingsley uOn 13 Feb 2009, at 18:08, Georgi Kobilarov wrote: I don't need the new URIs, but I think that the approach you described for minting language-specific identifiers is sound and the best that can be done. A single unified namespace would be better in theory, but I don't see any pragmatic possibility for managing such a namespace for DBpedia. Richard uThanks for the feedback so far! Seems like we should/will go for the approach of minting language-specific URIs for DBpedia resources and interlink them with owl:sameas. uThis would be a great means for using all these URIs. One thing I am not sure about, because I'm quite new in the Semantic Web: DBpedia is about things (persons, etc.). It would be an advantage to have one prefered URI for these things across all languages. Therefore, if a link between the English and the Italian wikipedia pages for an actor is established, and derived from this a DBpedia owl:sameAs, it would be nice to be able to express that the prefered URI is the one from dbpedia.org, not from it.dbpedia.org. This could also help if circles across languages should occur (eg. from redirects). I suppose that the problem is not specific to dbpedia, and maybe there are best practices in dealing with it I am not aware of. Cheers, Joachim" "Announcing the Ontology2 Edition of DBpedia 2015-10" "uPublic SPARQL endpoints, being a shared resource, often lack the performance and reliability that people need to do their projects. On the other hand, it can be a big hassle to get the required hardware and software together to work with large data sets in a database. That's why we've created the Ontology2 Edition of DBpedia 2015-10, available on the AWS Marketplace: This product consists of perfectly matched hardware, software and data to create your own private DBpedia SPARQL endpoint in minutes. With 50% more facts than the DBpedia SPARQL endpoint, the 2015-10 edition of DBpedia is richer than previous DBpedia distributions. More information about this product and how to use it can be found here: We invite you to try it out!" "infobox_properties_en.nt file of DBPedia 3.8" "uHi All, I am trying to check if infobox_properties_en.nt has valid RDF syntax. I found some errors, and that is why I could not able to load it into triple store. What would be the solution, if I want to load this file successfully? Thanks, Yashpal Hi All, I am trying to check if infobox_properties_en.nt has valid RDF syntax. I found some errors, and that is why I could not able to load it into triple store. What would be the solution, if I want to load this file successfully? Thanks, Yashpal uWhat are the errors? On 11 April 2013 16:55, Yashpal Pant < > wrote: u(We should include the list in our replies.) This is indeed an error. We'll try to fix this in the next release. At the moment, you could try to edit the file and replace the type \"#int\" by \"#integer\". On 12 April 2013 05:26, Yashpal Pant < > wrote: uHmI just checked the code and it looks like we already fixed this a long time ago: Jun 25, 2012 xsd:int -> xsd:integer But although we ran the extraction a month later, the file still contains xsd:int: curl -s | bunzip2 | less # started 2012-07-23T13:07:06Z \"299\"^^ . I don't know what went wrong. We should run InfoboxExtractor on a current Wikipedia dump and see what happens. JC On 12 April 2013 06:09, Jona Christopher Sahnwaldt < > wrote: uHi All, I could not load  infobox_properties_en.nt into sesame triple store, as it shows that this file is not valid RDF. 
If this file is not valid RDF, how can DBPedia people could able to load the file in the DBPedia SPARQL Endpoint available at:  Please suggest me a way, I want to load this file. Thnaks, Yash From: Jona Christopher Sahnwaldt < > To: Yashpal Pant < > Cc: dbpedia-discussion < > Sent: Sunday, April 14, 2013 3:08 PM Subject: Re: [Dbpedia-discussion] infobox_properties_en.nt file of DBPedia 3.8 HmI just checked the code and it looks like we already fixed this a long time ago: Jun 25, 2012 xsd:int -> xsd:integer But although we ran the extraction a month later, the file still contains xsd:int: curl -s | bunzip2 | less # started 2012-07-23T13:07:06Z \"299\"^^ . I don't know what went wrong. We should run InfoboxExtractor on a current Wikipedia dump and see what happens. JC On 12 April 2013 06:09, Jona Christopher Sahnwaldt < > wrote: uHi Yash, Can you tell us which triples cause the problem and we will try to fix them in the next release. In the meantime you can remove the wrong triples and load the rest of the dump. Best, Dimitris On Thu, May 2, 2013 at 9:58 AM, Yashpal Pant < > wrote:" "RDF URI References in DBPedia N-Triples dump" "uDear all, currently i'm trying to load triples from the DBPedia 3.7 dump (in particular de/page_links_de.nt). The parser is implemented on the definitions of  (a)  (b)  (c)  (d)  (e) In (d)    [] and would produce a valid URI character sequence (per RFC2396 [URI], sections 2.1) representing an          absolute URI with optional fragment identifier when subjected to the encoding described below. [] the decoding is done by percent-octet encoding. So, a valid RDF URI doesn't contain [specified by (a) 2.4.1 and updated by (b)] following characters                       unwise = \"{\" | \"}\" | \"|\" | \"\\" | \"^\" | \"`\" But a lot of triples seems to be invalid: . . around of 11k triples (out of 29900k) has such problems also lot of triples consists of non US-ASCII characters (currently encoded in preprocessing step) Are these problems induced by processing Wikipedia? BTW the NTriple specification should not be updated ( Regards, Andreas SMS schreiben mit WEB.DE FreeMail - einfach, schnell und kostenguenstig. Jetzt gleich testen! ?mc=021192 uDear Andreas you may find answers to similar questions in the mailing list archive Best, Dimitris On Mon, Oct 10, 2011 at 5:31 PM, Andreas Gebhardt < > wrote: uDear Dimitris, thank you - i found the reference to the invalid filenaming - Turtle instead of NTriple Best, Andreas SMS schreiben mit WEB.DE FreeMail - einfach, schnell und kostenguenstig. Jetzt gleich testen! ?mc=021192" "New Entity Descriptor Document Formats" "uAll, You may have picked this up from my tweets earlier today. We can now produce Atom (using OData's Atom+Feed dialect) based Descriptor Documents for Entities in DBpedia, LOD Cloud Cache, and any other Virtuoso based RDF store. Implications: Ultimately (once we iron some issues with existing 3rd party OData clients), transparent OData application access and exposure for all Descriptor Docs of all the LOD Cloud Cache and DBpedia Entities. Naturally, we encourage other RDF store providers to emulate what we've done re. building a bridge to the burgeoning OData realm (publication and consumption). Business of Linked Data Note: Microsoft has a market place for OData sets in place called \"Dallas\". In a nutshell, they want to make the process of curating and maintaining data sets sustainable via a compensation system. All you do is get your stall in their pre-furnished (till included) Data Mart :-) Links: 1. 
cKQAEV" "get only certain data out DBpedia 3.8 Downloads" "uHi, I have downloaded certain files for example instance_types_en.nt. I would like to extract specific information for example everything with or . I can use Excel but it is time consuming as the files are too big to put them in Excel. Also, I was thinking of processing the files by parsing the strings and save the information required in other file. However, my programming skills are rusty to say the least. Is there a way to get these specific datasets from dbpedia.org or do you know other website that can provide them? Thank you, Mike Hi, I have downloaded certain files for example instance_types_en.nt. I would like to extract specific information for example everything with < Mike uHi Mike, On 02/13/2013 03:21 PM, The Guy wrote: You can either use UNIX commands or SPARQL queries against one of the SPARQL endpoints [1, 2], to get the information you are looking for. * UNIX: grep ' ' instance_types_en.nt > output.nt where output.nt is the output file. * SPARQL: SELECT ?s WHERE {?s a } limit 1000 [1] [2] sparql uThank you Mohamed. Exactly what I needed. Is there any in built limitation? I used:     SPARQL: SELECT ?s WHERE {?s a } and I got a suspectly low and round number of 50,000 :) . For example Abraham_Lincoln is not in the CSV file I downloaded. Also, is there any 'magic' statement to get the short abstracts from 'short_abstracts_en.nq' for the 'categories' selected in 'instance_types_en.nt' be it 'Book', 'Place', etc.? Cheers, M From: Mohamed Morsey < > To: The Guy < > Cc: \" \" < > Sent: Wednesday, February 13, 2013 6:49:01 AM Subject: Re: [Dbpedia-discussion] get only certain data out DBpedia 3.8 Downloads Hi Mike, On 02/13/2013 03:21 PM, The Guy wrote: Hi, You can either use UNIX commands or SPARQL queries  against one of the SPARQL endpoints [1, 2], to get the information you are looking for. * UNIX: grep ' ' instance_types_en.nt > output.nt         where output.nt is the output file. * SPARQL: SELECT ?s WHERE {?s a } limit 1000 [1] [2] sparql u#!/bin/bash INSTANCE_TYPES_FILENAME=$1 SHORT_ABSTRACTS_TYPES_FILENAME=$2 while read line do first_part=( $line ) grep $first_part $SHORT_ABSTRACTS_TYPES_FILENAME done < $INSTANCE_TYPES_FILENAME" "social participation data and ontologies in DBpedia" "uDear users and developers, We made some social participation OWL ontologies and linked data representation of social participation platforms. Some of these developments are summarized in this article: , this UNDP report: and links therein. I thought a number of times in better integrating these developments to the LOD cloud and have the questions: *) Is this interesting to the DBpedia community? More specifically, is there any chance we can integrate the ontologies and the data to DBpedia? *) Should I think about a GSoC student proposal on this social participation integration conceptualizations and data to DBPedia? My apologies for not making this contact earlier, but I handed my doctorate dissertation a few days ago and could not concentrate as needed until now. Some info about my research and software development efforts are gathered here: Anyway, this topic might be of use for the DBPedia community as a whole and for developments outside GSoC. PS. I visited the \"get involved\" pages in wiki.dbpedia.org and am willing to help in a number of ways. I should be starting a postdoctorate soon and might emphasize the semantic web as we find suitable. 
Indeed, Brazillian research funding agencies are having some difficulties and I can apply for a DBpedia internship if you find it reasonable. The senior researchers of IFSC/USP (physics) and ICMC/USP (CS and mathematics) which I am related to are willing to formalize a partnership with other organizations. Best Regards, Renato Fabbri uHi Ghislain, On Thu, Mar 30, 2017 at 5:41 AM, Ghislain Atemezing < > wrote: These are described in section 4.1 of the article: Ex: and I published a preliminary version of this data in data hub: I might publish the up-to-date data if we (you included) find it reasonable. Yes! And the URIs do not deference by HTTP! We had them online in infrastructure from the USP cloud, but as Brazilian public entities are cutting funds, we lost it. If you know of any way I can put them online again, please drop me a line. + Do you have some links from your data to DBpedia dataset? Not anymore. I wanted to focus on the data consistency and links to external ontologies and databases should be brought again to life. Yes. Thanks. Maybe I should do this after the links to external resources are restored? PS. if you know of other discussion forums where I should present and mature there contributions, please tell me. Best, Renato uHi Renato, Good to hear that you are doing stuff on ontologies. I can’t help for DBpedia but as a consumer, can I ask you these small questions: + could you point us to the data and ontologies URIs? + Do you consider publishing your data on the data hub [1] to make that visible to the rest of the community (not only DBPedians)? + This link + Do you have some links from your data to DBpedia dataset? + Could you please submit your ontologies to LOV [2] so that others can reuse? Best, Ghislain [1] > u0€ *†H†÷  €0€10  `†He uThanks very much, Kingsley, your suggestions are noted. I wrote a GSoC proposal: Please feel free to comment at will. Best regards! On Thu, Mar 30, 2017 at 9:44 AM, Kingsley Idehen < > wrote:" "Disambiguating nutritional facts infoboxes" "uHi, I am looking at DBPedia for nutritional facts. I tried running a query: select distinct ?subject where { ?subject dbpprop:carbs ?value } limit 100 And I spotted a few issues (there might be more, but I stopped looking): 1. Take dbpedia:Squab_(food) as an example. The infobox on Wikipedia states \"There is some variation in nutritional content depending on the breed of utility pigeon used for squabbing.\". It is copied into DBPedia as a dbpprop:note, and I am not sure how to automatically figure out whether it relates to the infobox or not. Also I am missing the citation from Wikipedia. 2. Take dbpedia:Coconut as an example. The Wikipedia article has two infoboxes related to nutrition, one being \"Coconut-inner edible solid part, raw (fresh kopra)\", another \"coconut water\". In DBPedia all the values are collected, so every property have two values and I think it is almost impossible to figure out which value relates to which infobox. Furthermore the only place I see the name of the two infoboxes is the dbpprop:name, but it also contains two extra unrelated values. 3. Sometimes there is a source (e.g. USDA Nutrient Database). I can look at the property dbpprop:sourceUsda for an ID, but sometimes it just contains 1 if the infobox only links to the USDA search website in general and not the actual entry. Occasionally the value is just wrong, as in dbpedia:Orange_juice, where the USDA ID points to \"classic sirloin steak (10 oz)\". 
Maybe it is just because it was corrected on Wikipedia after the last import? 4. Sometimes there is a note saying \"Percentages are roughly approximated using US recommendations for adults\" including a link with further information. This information is not copied to DBPedia. How could this be improved. It might involve a lot of work, and I think the following points are important to consider: A. Create an ontology that corresponds to the combined ways of using the nutritional facts infobox. Create a resource for each infobox. B. Each nutritional facts resource must be named accordingly. It should contain notes if the values are uncertain in some way. It should reference sources if available. It should contain the percentage values and a note about how the percentage values are calculated including possible references for further information. C. Link each food/drink resource to one or more nutritional facts resource. Have I overlooked something? Is there any related work regarding this topic? Any comment is appreciated. Cheers, Bjarke Walling Hi, I am looking at DBPedia for nutritional facts. I tried running a query: select distinct ?subject where { ?subject dbpprop:carbs ?value } limit 100 And I spotted a few issues (there might be more, but I stopped looking): 1. Take dbpedia:Squab_(food) as an example. The infobox on Wikipedia states 'There is some variation in nutritional content depending on the breed of utility pigeon used for squabbing.'. It is copied into DBPedia as a dbpprop:note, and I am not sure how to automatically figure out whether it relates to the infobox or not. Also I am missing the citation from Wikipedia. 2. Take dbpedia:Coconut as an example. The Wikipedia article has two infoboxes related to nutrition, one being 'Coconut-inner edible solid part, raw (fresh kopra)', another 'coconut water'. In DBPedia all the values are collected, so every property have two values and I think it is almost impossible to figure out which value relates to which infobox. Furthermore the only place I see the name of the two infoboxes is the dbpprop:name, but it also contains two extra unrelated values. 3. Sometimes there is a source (e.g. USDA Nutrient Database). I can look at the property dbpprop:sourceUsda for an ID, but sometimes it just contains 1 if the infobox only links to the USDA search website in general and not the actual entry. Occasionally the value is just wrong, as in dbpedia:Orange_juice, where the USDA ID points to 'classic sirloin steak (10 oz)'. Maybe it is just because it was corrected on Wikipedia after the last import? 4. Sometimes there is a note saying 'Percentages are roughly approximated using US recommendations for adults' including a link with further information. This information is not copied to DBPedia. How could this be improved. It might involve a lot of work, and I think the following points are important to consider: A. Create an ontology that corresponds to the combined ways of using the nutritional facts infobox. Create a resource for each infobox. B. Each nutritional facts resource must be named accordingly. It should contain notes if the values are uncertain in some way. It should reference sources if available. It should contain the percentage values and a note about how the percentage values are calculated including possible references for further information. C. Link each food/drink resource to one or more nutritional facts resource. Have I overlooked something? Is there any related work regarding this topic? Any comment is appreciated. 
Cheers, Bjarke Walling uHi Bjarke, I work a bit on Food & Drink (edamam.com before, now Europeana Food & Drink which is not about nutrition but anyway). For edamam we processed recipes. We targeted USDA Standard Reference (is this the same as USDA Nutrient Database?). - We mapped normal food names (e.g. Steak) to one of the hundreds of steaks on USDA using a FreeBase dataset - We mapped ingredient lines (especially unit and number) using heuristics and duct tape. Oh it was so much fun! I haven't looked at nutrition info on Wikipedia, but knowing how fiddly is this info, I doubt it'd be structured and entered properly in every article to enable good extraction. You make a good point with Coconut: why put two foods on one page? AFAIK, DBpedia can't extract twice from the same page. I think you're better off looking at specialized sites. 1. Mostly about nutritional values, labels, countries of ready foods. Has a great number of varieties and brands. Mostly EN & FR 2. - info - very well done, but only Russian, from - eg 3. - info - food: 66k recipes, 22k persons, classified foods (eg seafood, spices) - foodista: 32k recipes - uses this ontology, let me know if you find a RDF file Even if not, you can model something similar in DBpedia; or use hRecipe or schema.org/Recipe" "Edit permission request" "uHi, I'd like to start adding to your mappings. My main language is German, but I also speak a bunch of other languages this means I would like to work like \"Cities in language A, B, C etc. since it is somewhat repeating and will be easier for me at the very beginning just to deal with one type of template. [[User:Jimregan]] already showed me first steps. So I can do things expanding step by step. So the next question that comes up is: how to add further languages? For now I can (and will mainly) work on German which is already present as a namespace. Other possible Wikipedia-languages: Italian French Neapolitan Piedmontese Bavarian Spanish Portuguese For me the less resourced ones are the most relevant to work on, since our association is centered on less resourced cultures. My main aim will not be to edit loads of mappings, but do small parts regularly and to attract mainly people to this project, because if many do one part :-) well we all know how this works. So I hope you will get me edit rights and I can start to do my first steps online. Thanks and have a nice week-end! Bina Hi, I'd like to start adding to your mappings. My main language is German, but I also speak a bunch of other languages this means I would like to work like 'Cities in language A, B, C etc. since it is somewhat repeating and will be easier for me at the very beginning just to deal with one type of template. [[User:Jimregan]] already showed me first steps. So I can do things expanding step by step. So the next question that comes up is: how to add further languages? For now I can (and will mainly) work on German which is already present as a namespace. Other possible Wikipedia-languages: Italian French Neapolitan Piedmontese Bavarian Spanish Portuguese For me the less resourced ones are the most relevant to work on, since our association is centered on less resourced cultures. My main aim will not be to edit loads of mappings, but do small parts regularly and to attract mainly people to this project, because if many do one part :-) well we all know how this works. So I hope you will get me edit rights and I can start to do my first steps online. Thanks and have a nice week-end! 
vmf.i-iter.org" "DBpedia-Cyc linkage" "uHi all, The commonsense knowledge base Cyc [1] or OpenCyc [2] (when compared to DBpedia) seems to follow a rather top-down approach – first more abstract concepts and entities are represented, and later Cyc started to include more domain knowledge as well. This seems reasonable, since domain knowledge changes faster and there is much more of it. On the other hand, domain knowledge is usually what people need to solve real problems within their domains. DBpedia contains primarily domain knowledge, hence a combination of both – Cyc and DBpedia – could really be a winning team. Together with the committed OpenCyc community we produced a first DBpedia-Cyc linkage, which is now available as a DBpedia dataset from the downloads section [3]. The dataset will soon also be loaded into the DBpedia SPARQL endpoint and made available as linked data. More information about the linkage can also be found at: OpenCyc" "dbpedia.org sample query not working" "uI'm trying to learn how to query the datasets but am having a hard time getting my own queries working. I do not understand, for example, why the query below, linked from the Datasets page on the site, does not work (no result): SELECT ?subject ?label ?released ?abstract WHERE { ?subject rdf:type . ?subject dbpedia2:starring . ?subject rdfs:comment ?abstract. ?subject rdfs:label ?label. FILTER(lang(?abstract) = 'en' && lang(?label) = 'en'). ?subject dbpedia2:released ?released. FILTER(xsd:date(?released) < '2000-01-01'^^xsd:date). } ORDER BY ?released LIMIT 20 Any idea? uI fiddled with the query, simplifying it until I found one that brought results back through formatted here with whitespace, so it's a bit more clear what's being queried" "Release date for infobox films not parsed" "uHi, Folks, I am looking at the parsed results for the mapped categories here: As you can see, the release date for the movie, which uses the infobox film template, is populated with the value 1996; however, this value is not parsed in the mapped fields here: I looked at the dbpedia template, and release date is mapped to the dbpedia ontology: Does anyone know why this field is not being extracted? thanks, Mandar uHi Mandar, The value of the release date for the film you mention is {{Film date|1996|||}} - so, not just the year number. The complex format is very likely the reason why it is not parsed. Cheers, Volha On 2/6/2015 6:47 AM, Mandar Rahurkar wrote: uHi, Volha, Thanks. How were you able to extract that information? When I look at the source of the wikipedia page I see the infobox template is Infobox_film: Moreover, I downloaded the dbpedia mappings as well as the infobox datasets extracted last year and did not see the release date populated for any of the movies. thanks, Mandar On Thu, Feb 5, 2015 at 10:54 PM, Volha Bryl < > wrote: uHi Mandar!
Run these queries on First check the raw property dbo:released: PREFIX dbo: PREFIX dbp: PREFIX rdfs: select * {?x a dbo:Film; dbp:released ?rel filter exists {?x rdfs:label ?lab filter(strstarts(?lab,"Act"))}} order by ?x limit 100 As you can see, many movies have it, but not Actrius. So Volha is right: the problem is that in that movie it's not a plain date. It's in | release = {{Film date|1996|||}} I tried to make a mapping: to extract release year and location (there can be several). But it doesn't extract anything. Maybe templates INSIDE template fields are not used for extraction? Issue: Test cases: http://mappings.dbpedia.org/index.php/Mapping_en_talk:Film_date If that's the case, we could map it to another date template here: https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/config/dataparser/DateTimeParserConfig.scala#L97 But Volha, can it extract SEVERAL dates from one template? uHi Vladimir, Mandar, The mappings extractor can't handle nested templates: @Dimitris: I know this is on your to-do list, any progress so far? Cheers, Alexandru On Wed, Feb 18, 2015 at 5:49 PM, Vladimir Alexiev < > wrote: uThanks guys for your comments! Release date information for April Love (film) is available but not for And if you examine the wikipedia pages, they both seem to use a nested template: So maybe this is more than one issue? thanks, Mandar On Wed, Feb 18, 2015 at 9:22 AM, Alexandru Todor < > wrote: uHi Mandar :) DBpedia does not handle nested templates. It may work for some specific (simple-enough) templates but it is in no way generalized. That's why consumer-grade projects consuming Wikipedia data either: 1) Scrape Wikipedia HTML pages directly, i.e. template interpretation is done by MediaWiki, on wikipedia.com or on dedicated Wikipedia mirrors. 2) Set up their own Wikipedia extraction framework, which may interpret templates directly or delegate to MediaWiki using its API. Nicolas. On Wednesday, February 18, 2015 10:56 AM, Mandar Rahurkar < > wrote: uThanks Nicolas ! :) 1. Scraping rendered wikipedia html pages seems like it would be noisy in terms of data quality. Isn't that so? 2. If we delegate to MediaWiki API, is this option scalable if we had to parse the wikidump on a daily basis?
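As a rough way to gauge how many films are affected by the nested {{Film date}} issue discussed above, one could compare the raw and the mapped property on the public endpoint. A sketch, assuming the usual dbo:/dbp: namespaces and that dbo:releaseDate is the mapped target:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?film ?raw WHERE {
  ?film a dbo:Film ;
        dbp:released ?raw .                        # a raw infobox value was extracted...
  FILTER NOT EXISTS { ?film dbo:releaseDate ?d }   # ...but no mapped date
}
LIMIT 100

Films whose release information only sits inside a nested {{Film date}} call should show up with a raw value but no mapped one.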
thanks, Mandar On Wed, Feb 18, 2015 at 1:11 PM, Nicolas Torzec < > wrote: uRegarding scraping Wikipedia HTML pages: It's a different type of extraction, but all the relevant structured data is there (e.g. infobox template name, attribute names and values, categories, etc.) and all wiki templates have already been interpreted into user-friendly plain-text values, so you don't have to. Regarding delegating to MediaWiki: Wikipedia's MediaWiki Special API has a 1 QPS rule, but you can request more than one page per call and you only need to do it for pages that have been added or modified. So, depending on your update frequency, it may be doable. Otherwise you will have to install and maintain your own MediaWiki cluster as a Wikipedia mirror, so you can hit it as hard as you need. Nicolas. On Wednesday, February 18, 2015 3:45 PM, Mandar Rahurkar < > wrote: uTry the live extractor, it doesn't have releaseDate: Someone edited the page since Aug 2014. uThanks Nicolas and Vladimir ! On Thu, Feb 19, 2015 at 8:11 AM, Vladimir Alexiev < > wrote: uOn Wed, Feb 18, 2015 at 7:22 PM, Alexandru Todor < > wrote: As I said in another thread, it is very trivial to change but needs testing. Any volunteers from the community?
I can provide an adapted version of the code and also dumps but someone needs to look at the data Cheers, Dimitris uI can help look at the data. Mandar On Fri, Feb 20, 2015 at 3:19 AM, Dimitris Kontokostas < > wrote: uHi Dimitris, I'll subscribe to looking at the data. I'm overdue in releasing a new German DBpedia dump, so I can run new extraction and look at some nested template cases that I already know. Could you also make the extractor log each nested template it finds ? Cheers, Alexandru On Mon, Feb 23, 2015 at 7:26 PM, Mandar Rahurkar < > wrote:" "Getting the hyperlink text of the Wikipedia articles' "External Links"" "uHello discussion group :) I'd like to have a list of *external links* of wiki pages (the ones that lead to external sites, at the end of each article), but also to have *the text of the hyperlink* as it appears in the article. For example, for the 2 External Links at the bottom of the strings \"Official website \" and \"Website for the 2004 Summer Olympic Games \" I know there's an ExternalLinks extractor, but how do I extract the hyperlink text too? Cheers, Omri Hello discussion group :) I'd like to have a list of external links of wiki pages (the ones that lead to external sites, at the end of each article), but also to have the text of the hyperlink as it appears in the article. For example, for the 2 External Links at the bottom of Omri uHi Omri, DBpedia parses link texts, but later doesn't use them (I think). You will have to change the code a little. You could make a copy of ExternalLinksExtractor [1] and adapt it to your needs: go through all the external links and instead of link.destination call link.toPlainText. See ExternalLinkNode [2]. Or maybe even better, add a configuration setting to ExternalLinksExtractor that determines if link texts should also be extracted. What should the RDF triples that you generate look like? Before you start, you should read this page: This way, you can send us a pull request when you are done, and if we like your changes, we can incorporate them and everyone can reap the benefits. :-) Cheers, JC [1] [2] On 3 April 2013 14:17, Omri Oren < > wrote:" "bif:contains text search treated as a prefix" "uI am trying to run this against the public DBPedia SPARQL gateway (text search) This is straight from the website docI get this error: Exception in thread \"main\" com.hp.hpl.jena.query.QueryParseException: Line 1, column 588: Unresolved prefixed name: bif:contains Here is the query string: String sparqlQueryString = \" PREFIX owl: \"+ \" PREFIX xsd: \"+ \" PREFIX rdfs: \"+ \" PREFIX rdf: \"+ \" PREFIX foaf: \"+ \" PREFIX dc: \"+ \" PREFIX : \"+ \" PREFIX dbpedia2: \"+ \" PREFIX dbpedia: \"+ \" PREFIX skos: \"+ \" SELECT DISTINCT ?x ?y\"+ \" FROM \"+ \" WHERE {\"+ \" ?x rdfs:label ?y .\"+ \" ?y bif:contains \\"Berlin\\" .\"+ \" ?x skos:subject ?z .\"+ \" }\"; Why is Virtuoso thinking bif is a prefix? I am trying to run this from Jena because the web interface give me a time out. Thanks much! 
uI am trying to run this against the public DBPedia SPARQL gateway (text search) This is straight from the website docI get this error: Exception in thread \"main\" com.hp.hpl.jena.query.QueryParseException: Line 1, column 588: Unresolved prefixed name: bif:contains Here is the query string: String sparqlQueryString = \" PREFIX owl: \"+ \" PREFIX xsd: \"+ \" PREFIX rdfs: \"+ \" PREFIX rdf: \"+ \" PREFIX foaf: \"+ \" PREFIX dc: \"+ \" PREFIX : \"+ \" PREFIX dbpedia2: \"+ \" PREFIX dbpedia: \"+ \" PREFIX skos: \"+ \" SELECT DISTINCT ?x ?y\"+ \" FROM \"+ \" WHERE {\"+ \" ?x rdfs:label ?y .\"+ \" ?y bif:contains \\"Berlin\\" .\"+ \" ?x skos:subject ?z .\"+ \" }\"; Why is Virtuoso thinking bif is a prefix? I am trying to run this from Jena because the web interface give me a time out. Thanks much! u u uHi all, I think you are not querying the Virtuoso endpoint. Are you accessing this URI: the issue ;) No bif:contains doesn't have any prefix and is a built-in function of the datastore. Take care, Fred u uThanks for your reply. I am accessing the correct URI It turns out that from Jena, you need to use in angulat bracket to avoid the Jena parser complaining that there is no matching prefix. uSeaborne, Andy wrote: uFirst I would like to say thanks for the replies so farI am trying to learn a lot of things and you all have been very helpful. Here is what i need to understand: 1- bif:contains is not SPARQL or even ARQ but exclusive to openlink virtuoso? 2- What is the standard way of doing a text search on the RDF graphs? (I want to find if all the triples where the URI property contains a certain string) 3- How is bif:contains different than using regex in filter? is using filter the more standard way but slower in performance for example? SELECT ?g WHERE { ?y vcard:Given ?g . FILTER regex(?g, \"searchString\", \"i\") }4- is there a tutorial or some docs on how I can do text search on RDF graphs so I dont have to post to the mailing list if my questions are too trivial? I think learning how to do this is essential as it is a good way to explore a new RDF graph for example. Thanks, Marv uMarvin Lugair wrote: Short answer: yes. Long answer: It invokes a Virtuoso built-in-function (that's what the acronym BIF stands for - historical reasons.) Note that you can call Virtuoso's SQL functions and stored procedures from both WHERE and result patterns. We use the sql: and bif: namespaces for that. See Calling SQL from SPARQL for an explanation. Using the regex pattern. It uses Virtuoso's full text index, which is significantly faster if you have a large number of literals to search from. Please see I don't think we have a tutorial no regex specifically, though I'm not in that department. If you peruse the Virtuoso documentation, you should find at least some examples there. The links above do contain a wealth of information and examples on how to use SPARQL - also in generic context. Cheers, Yrjänä" "Airpedia resource" "uDear DBpedia community, I am a PhD student from Fondazione Bruno Kessler [1] in Trento and I'm working with my team on Airpedia [2], which is a semantic resource based on machine learning techniques that aims to extend the coverage of DBpedia on classes (and, in a second step, on properties). A draft version of the resource is available on our website. we are currently working on releasing it to the Semantic Web Community, and investigating on the best RDF format to use. Actually, we use a simple CSV format. 
For example: #ID Class Relevance 140132 Eukaryote 10 140132 Animal 10 140132 Fish 10 140132 Species 10 140137 OlympicResult 8 140143 Eukaryote 10 140143 Amphibian 10 140143 Animal 10 140143 Species 10 The ID column refers to a WikiData ID, and can be solved on the WikiData website on the link the guessed DBpedia class; the Relevance column is our confidence about the class (from 7 to 10, being a k-NN voting, k=10). It is really easy for us to retrieve the DBpedia ID given the WikiData ID. Which is, in your opinion, the best way to represent this data in RDF, keeping in mind that we want to differentiate our triples from the original DBpedia ones and we want the relevance to be preserved? We have in mind the folowing candidate solutions. (\"air\" is our RDF namespace) *Solution 1(string concatenation)* * ID air:type Class . * ID_Class air:confidence Relevance . * sameAses For example: 140132 Eukaryote 10 140132 Animal 10 140132 Fish 10 140132 Species 10 becomes: . \"10\"^^xsd:int . . \"10\"^^xsd:int . . \"10\"^^xsd:int . . \"10\"^^xsd:int . owl:sameAs . owl:sameAs *Solution 2(blank nodes)* * ID air:isClassified bNode * bNode air:type Class * bNode air:confidence Relevance * sameAses For example: 140132 Eukaryote 10 140132 Animal 10 140132 Fish 10 140132 Species 10 becomes: _:1 . _:1 . _:1 \"10\"^^xsd:int . _:2 . _:2 . _:2 \"10\"^^xsd:int . _:3 . _:3 . _:3 \"10\"^^xsd:int . _:4 . _:4 . _:4 \"10\"^^xsd:int . owl:sameAs . owl:sameAs While waiting for your suggestions, we finish the classification and make the CSV available on our website [2]. Thank you! Best, Alessio [1] [2] I’m working with my team on Airpedia [2], which is a semantic resource based on machine learning techniques that aims to extend the coverage of DBpedia on classes (and, in a second step, on properties). A draft version of the resource is available on our website. we are currently working on releasing it to the Semantic Web Community, and investigating on the best RDF format to use. Actually, we use a simple CSV format. For example: #ID    Class    Relevance 140132    Eukaryote    10 140132    Animal    10 140132    Fish    10 140132    Species    10 140137    OlympicResult    8 140143    Eukaryote    10 140143    Amphibian    10 140143    Animal    10 140143    Species    10 The ID column refers to a WikiData ID, and can be solved on the WikiData website on the link the Class column is the guessed DBpedia class; the Relevance column is our confidence about the class (from 7 to 10, being a k-NN voting, k=10). It is really easy for us to retrieve the DBpedia ID given the WikiData ID. Which is, in your opinion, the best way to represent this data in RDF, keeping in mind that we want to differentiate our triples from the original DBpedia ones and we want the relevance to be preserved? We have in mind the folowing candidate solutions. (“air” is our RDF namespace) Solution 1 (string concatenation) ID air:type Class . ID_Class air:confidence Relevance . sameAses For example: 140132    Eukaryote    10 140132    Animal    10 140132    Fish    10 140132    Species    10 becomes: < make the CSV available on our website [2]. Thank you! Best, Alessio [1] www.airpedia.org u0€ *†H†÷  €0€1 0 + uI am a fan of the SPARQL result set format whenever people want to express tuples of nodes: I think it’s more standard than Turtle, and it is as efficient as you’ll get unless you want a binary format. This file can be processed with simple streaming tools like awk or even passed into something like Pig. 
If you want to load the facts into the triple store you could just toss out the relevance rating or filter only facts where the relevance rating is 9 or above. If you wanted to produce the kind of RDF you suggest, you could do that too. You could also md5sum the triples and stuff the relevance data in a key-value store where it won't add load to the triple store. From: Alessio Palmero Aprosio Sent: Monday, June 10, 2013 11:15 AM To: dbpedia-discussion Subject: [Dbpedia-discussion] Airpedia resource uOk, maybe I was not very clear in my previous mail: the problem was whether to use blank nodes or string concatenation (or a proposal for a better solution). We can discuss the format in a different mail thread :) Alessio On 10/06/13 20:38, Paul A. Houle wrote: uCiao Alessio, maybe this can be of help (but probably you are already aware of it): Wikidata is also discussing something similar: I remember reading blank nodes are evil, but unfortunately I do not have enough experience to confirm/deny. Andrea 2013/6/11 Alessio Palmero Aprosio < >
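For what it is worth, a sketch of how Solution 1 would be consumed, with the air: namespace URI and the <ID>_<Class> naming taken as assumptions from the proposal above (the exact URIs are not given there); the confidence stays on an addressable resource rather than a blank node:

PREFIX air: <http://airpedia.org/ontology/>    # placeholder namespace, invented for illustration
SELECT ?entity ?class ?conf WHERE {
  ?entity air:type ?class .                    # Solution 1 keeps the typing triple on the entity itself
  # the score sits on a companion resource named <entityURI>_<ClassName>
  BIND( IRI( CONCAT( STR(?entity), "_", STRAFTER(STR(?class), "ontology/") ) ) AS ?scored )
  ?scored air:confidence ?conf .
  FILTER ( ?conf >= 9 )                        # keep only high-confidence typings, as suggested above
}
LIMIT 100

With Solution 2 the same filter works, but the intermediate node is a blank node and cannot be referenced from outside the dataset, which is essentially the trade-off debated in the following replies.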
On 11 June 2013 11:20, Andrea Di Menna < > wrote: uHi Alessio, +1 for solution 1, maybe dirtier than solution 2, but more intuitive. WRT semantics, I suggest you to use the standard rdf:type predicate instead of for easier querying. WRT syntax (i.e., serialization), I think Turtle is the best way, both for machines (compactness) and for humans (readability). Given your example, you would obtain something like the following: @prefix dbp-owl: . @prefix air: . @prefix air-voc: . @prefix xsd: . air:140132 a dbp-owl:Animal, dbp-owl:Eukaryote, dbp-owl:Fish, dbp-owl:Species ; = , . air:140132_Animal air-voc:confidence \"10\"^^xsd:int . air:140132_Eukaryote air-voc:confidence \"10\"^^xsd:int . air:140132_Fish air-voc:confidence \"10\"^^xsd:int . air:140132_Species air-voc:confidence \"10\"^^xsd:int . Cheers! On 6/10/13 5:15 PM, Alessio Palmero Aprosio wrote:" "How to use mappingBasedExtractor without *.xls files?" "uHello! I want to use dbpedia to parse Slovenian wiki. I fixed and updated most of the extractors, that they are more translation friendly and now I want to use mappingBasedExtractor. The problem is that mapping.xls and rules.xls which I wanted to use to create the ontology were deleted in revision 1441 with log message \"No longer needed\". I googled and found this topic I want to know if I should use mapping.xls and rules.xls from previous revision and create ontology with them or wait for this new way of specifying mappings and how long would that probably be? I also see that *.csv files in ontology folder were updated even after *.xls files were removed but mapping.xls from last usefull revision is the same as one I donwnloaded in june. Best regards, Marko uHi Marko, it's great that you're working on a Slovenian extraction! In which way did you modify the extractors? Maybe we can add your changes to the repository. The definitions given directly in Wikipedia will be used for the live extraction (the group in Leipzig is working on that), while the definitions in the files are used to produce the dumps found on mapping.xls and rules.xls were replaced by mapping.csv and rules.csv. The first version of the CSV files contained the same data as the Excel files, but going forward from there, we only updated the CSV files. They use the same \"format\" - the columns have the same meanings as in the Excel files. They are described in dbpedia/ontology/docs/dbpedia_mapping.txt. When you open the CSV files with OpenOffice, you will be asked for the character encoding, field separator and text delimiter used in the file. Set the character encoding to UTF-8, the separator to \";\" (semicolon) and uncheck all other separators, and make sure that the text delimiter is empty (default is a quote). Similarly in Excel. When you adapt the mappings for the Slovenian Wikipedia, make sure that you only change the template URLs and template property names, but not the class names and ontology properties. The main reason for replacing the .xls file was that working with a binary format like .xls is hard. Finding the differences between different versions of such files is almost impossible, as is writing scripts that parse them. Our scripts that copy the mappings and rules to the database (dbpedia/ontology/mapping_db.php and dbpedia/ontology/rules_db.php) always worked on CSV files, which we had to export from OpenOffice or Excel first. Now we can avoid this extra step. 
Cheers, Christopher On Tue, Sep 22, 2009 at 00:05, Marko Burjek < > wrote: uHi Marko, we have recently been migrating from the xls to csv as you have already noticed. There were also some changes in the logic of the mapping file so you shouldn't use the xls anymore since the code doesn't correspond to it. The extraction and mapping database can be filled and built with the ontology/mapping_db.php (and ontology/rules_db.php) script. The database name should be dbpedia_extraction_$language. Good luck, Anja Marko Burjek schrieb: uHi, if you want Slovenian URIs you can copy the config/dbpedia-dist.ini to config/dbpedia.ini and adjust some options like: language = sk dependsOnEnglishLangLink = false dbpedia_ns = generateOWLAxiomAnnotations = false geobatchextraction = false geousedb = false persondataUseDB = false LiveMappingBased.useTemplateDb = false The last four turn off the database dependancies. Could you maybe send us your configuration of the extract_all.php script. We are currently working on enabling language specific extractions of DBpedia. The code seems to be ready, but we don't have time to test it. It would be nice if you could tell us about the adaptions you made as we are eager to include them into the code. Regards, Sebastian Jona Christopher Sahnwaldt schrieb: uHi Dne torek 22 septembra 2009 je Sebastian Hellmann napisal(a): I have already done that, but thanks anyway. BTW Slovenia is sl, sk is Slovakia ;). BTW is there any specific reason that persondataUseDB selects whole article and then searches it for link to english one instead of looking into langlinks table? Sorry I can't find that file in my source code folder. I wanted to send diffs to this list but I have mistakenly sent them only to Jona Christopher Sahnwaldt :(. Should I sent them here to? Thanks everyone for all information. wrote: uHi. I'm sending them to the list. Marko Dne sreda 23 septembra 2009 je Jona Christopher Sahnwaldt napisal(a):" "Contd. RelFinder - Version 1.0 released" "uSteffen Lohmann wrote: I've just done a few things: 1. Added a gr:BusinessEntity triples for both Apple and Google 2. Sponged the DBpedia hosted descriptions of both 3. Used the FILTER feature of Relfinder to remove seeAlso links from the mix . When you refresh the URL above, I hope you see some aspect of what my initial response and euphoria was about re. progressively generated and enhances Linked Data Spaces. More than anything else, in my world view, Linked Data is about: 1. Loose coupling of Information and Data 2. Ability to explore data relations through your own Context Lenses across their various dimensions. The steps above reduce the cost of \"Subjectivity\" and \"Myopia\", two things that always benefit from Information bearing resources that don't provide routes to their data sources. Remember, information is \"data in context\", and it's inherently subjective :-) BTW - even if RelFinder only had seeAlso links, re. my original comments, don't you see how the graph exposes erstwhile invisible data or serendipitously unveils other relevant things (again very much dependent on the context of the beholder). In my case, I found the keynote-speakers node interesting (amongst others). Kingsley" "URI encoding scheme for dbpedia.org" "uHello! We keep hitting various URI-encoding related issues in the last couple of weeks. The rules at that brackets should be escaped. However for a number of resources it doesn't appear to be the case, e.g. of the information - YAGO types) vs. but no YAGO types). 
Could it be caused by Best, Yves uHi Yves, This is a bug from the yago dataset. You can see in [1] for more info. Best, Dimitris [1] On Mon, Jul 23, 2012 at 1:37 PM, Yves Raimond < >wrote: uHello! I am even more confused after reading that email :) In the example the existence of the second is a bug. The URIs used in the YAGO dump were not properly encoded before loading (as you can see this resource only has YAGO properties). This will be fixed in the next release. In the example below and for lots of other URIs we're dealing with, this is exactly the inverse. The %-encoded URI is the one appearing in the YAGO dataset. The non-encoded URI seems to be the 'real' one. So which URI should we be using? Best, y On Mon, Jul 23, 2012 at 11:50 AM, Dimitris Kontokostas < > wrote: uwell, it's a bug, so both :) If you want to retrieve yago, some or all of them do not decode the '(' / ')' If you wait a while for the new release this should be resolved Best, Dimitris On Mon, Jul 23, 2012 at 1:56 PM, Yves Raimond < >wrote: uSo their URIs, I suppose that in the example below it will move to information. But that one doesn't have its brackets escaped, which the URI encoding rules say it should. If that is the case, would it be possible to update that page to describe the updated encoding rules? Best, y On Mon, Jul 23, 2012 at 1:28 PM, Dimitris Kontokostas < > wrote: uOn Mon, Jul 23, 2012 at 3:37 PM, Yves Raimond < >wrote: No. To be more specific, the bug/error only is in this external file which was loaded in virtuoso. This file only contains (some) wrongly encoded DBpedia resource URIs Best, Dimitris uOn Mon, Jul 23, 2012 at 2:19 PM, Dimitris Kontokostas < > wrote: Sorry to be a bit pushy, but that dump actually has URIs that are formatted rightly according the URI encoding guidelines, with brackets escaped, which is my main point. Just to sum up: * URI encoding guidelines say brackets should be %-escaped * Yago dump has them %-escaped * DBpedia dump doesn't have them %-escaped Which I hope explains why I find all that very confusing! Best, y uOn 7/23/12 9:53 AM, Yves Raimond wrote: Conclusion, the dump at: which is based on Yago contains incorrect mappings, right? Kingsley uWell, my problem is more that the current DBpedia dump doesn't seem to apply the URI encoding rules at makes it basically impossible to predict what Wikipedia URI corresponds to what DBpedia URI. This Yago dump escapes brackets, exactly as described in those encoding rules. Best, y uOn 7/24/12 4:50 AM, Yves Raimond wrote: Okay, just wanted to be clear as a new DBpedia release is imminent :-) uHi Yves, a bit closer to Wikipedia encoding and don't escape brackets anymore. It's all a bit confusing right now because data, which is not yet available for download, but will be very soon. The main datasets don't escape brackets anymore, some link datasets still do. We should probably fix the link datasets. I'll update Meanwhile, the escaping rules can be found in this Scala code: private val iriReplacements = StringUtils.replacements('%', \"\\"#% ?[\\]^`{|}\") This means that only the following characters are URI-encoded (\"percent-encoded\"). The first character is a double quote. \"#% ?[\]^`{|} As usual, space is replaced by underscore. The additionally we URI-encode all non-ASCII characters. So much for now, Christopher On Tue, Jul 24, 2012 at 10:50 AM, Yves Raimond < > wrote: uHello! That's great - thank you Christopher! 
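A quick way to see which spelling of a bracketed resource an endpoint actually holds is to probe both forms side by side. A sketch using an arbitrary bracketed resource (Squab_(food), borrowed from an earlier thread) rather than the URIs discussed above:

SELECT ?form (COUNT(*) AS ?triples) WHERE {
  VALUES ?form { <http://dbpedia.org/resource/Squab_(food)>
                 <http://dbpedia.org/resource/Squab_%28food%29> }   # unescaped vs. percent-escaped brackets
  ?form ?p ?o .
}
GROUP BY ?form

Per Christopher's list, brackets are no longer in the percent-encoded set, so normally only the unescaped form should carry the mapped data, while the escaped form only turns up via older link datasets.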
Best, y" "A problem with categories datasets in DB3.8" "uHi Friends, I want to extract wiki articles category graph and find your datasets fortunately to avoid parsing the huge dump by myself. Thank you so much for the effort. However, I found something strange doing BFS on the graph based on Categories(Skos): there are lots of category nodes that cannot be reached. I checked briefly and found that it seems to lost some \"belongs to\" link between subcategory and category. For example, \"Category: British_People_By_Occupation\" should belong to \"Category: British_People\" while the dataset does not contain such info.(it only contain the record that this category is a core concept) Could you please help check it? Thanks a lot. Regards, uHi Ning, Can you please confirm that the same thing does not happen in Wikipedia too? Best, Dimitris On Mon, Jan 28, 2013 at 7:06 AM, Ning Zhang < > wrote: uHi Dimistris, does not seem so: Cheers Andrea 2013/1/28 Dimitris Kontokostas < > uThanks Andrea, @Ning, DBpedia tries to be an exact semantic mirror of Wikipedia so if you want to fix these \"errors\" you should try to fix them at the source (which is Wikipedia) and on the next DBpedia release they will be fixed Best, Dimitris On Mon, Jan 28, 2013 at 4:14 PM, Andrea Di Menna < > wrote: uI am sorry :) I meant there exists a link in those pages. (picked the incorrect words to express myself). There could an issue in the SkosCategoriesExtractor. If I am not wrong, the triple should be collected when analysing the How can I get a minimal example to run the extractor on and try to debug it? Regards Andrea 2013/1/28 Dimitris Kontokostas < > udon't worry :) This is an old issue. In this case Wikipedia applies categories through special templates for instance: {{nationality by occupation|Country=United Kingdom|Nationality=British}} The framework cannot expand the templates and collects only the handwritten categories (none in this case) For testing you can do 2 things: 1) use a trimmed dump and run the dump based extraction as normal 2) use the server module, it should open something like this [1] and then you could try single page extraction directly via wikipedia [2]. This is how I used to test single pages, however, I am not 100% sure that it is still valid, Jona Christopher (in cc) did a lot of changes last year in this module so maybe he can give some input @JC, Can you confirm this? Best, Dimitris [1] [2] On Mon, Jan 28, 2013 at 4:34 PM, Andrea Di Menna < > wrote: uThank you all. @Dimitris&Andrea;, if it comes to be a bug of the extractor, then could you give me a brief estimation of how long it will take to re-extract it? I will just keep looking on this discussion and just let me when any conclusion is gotten or any further work I can help do. Best, Ning On Mon, Jan 28, 2013 at 10:06 AM, Dimitris Kontokostas < > wrote: uHey Ning, This is not exactly a bug. The problem with templates is that are usually very complicated and we would have to re-implement the MediaWiki engine in Java in order to parse and expand them correctly (or come up with another alternative like the use the MW API somehow in the extraction process). This requires a lot of changes in the code for a very small added value (like your case) and is not likely to be implemented soon. 
Unless of course someone from the community volunteers to help ;) Another approach for this specific case would be to gather similar templates and see if we can add them as parameters of the CategoryExtractor The first part here is to try and gather all similar templates, and if they get too many we should focus on the first and more genera approach. Best, Dimitris On Mon, Jan 28, 2013 at 6:28 PM, Ning Zhang < > wrote: uThanks for the explanation:) I fully understand the situation and I am really shocked by the great framework you have built. However I just want to understand what happens here. If this problem comes because the CategoryExtractor does not know how to expand some templates, how can we solve it by changing the WM engine? In the alternative way, do you mean that to implement such expansion/substitution is too complicate for the extractor so using MW API will be better? Best, Ning On Mon, Jan 28, 2013 at 1:17 PM, Dimitris Kontokostas < > wrote: uThe framework is written in Java / Scala so, in order to expand a template instance inside an article there are two options 1) re-implement (part of) the PHP MW engine in Java / Scala or 2) use the MW API during the extraction process Both of the approaches introduce complexities & dependencies for a small added-value. The first one will be faster while the second more accurate but I am not sure which one I'd choose. For the \"Third\" approach (passing these templates as configurations) we did something similar with dates see the variable templateDateMap in the following files for an example core/src/main/scala/org/dbpedia/extraction/dataparser/DateTimeParser.scala (use) core/src/main/scala/org/dbpedia/extraction/config/dataparser/DateTimeParserConfig.scala (definition) This could also be an option if the number of category templates is small, but needs some further research in Wikipedia Best, Dimitris On Mon, Jan 28, 2013 at 11:05 PM, Ning Zhang < > wrote:" "About classification, inference and the BDpedia Ontology" "uHi list, I would like to know a little bit more about how the classification involving the DBpedia Ontology works (that is, dbpedia_3.5.1.owl) . For example, it might come really handy for what we are developing to have a list of things classified (or ordered) by the bdpedia-owl classe's names. So far I managed to get more or less what I intended: PREFIX owl: PREFIX dbo: PREFIX rdfs: SELECT ?className ?class ?name ?summary ?uri where { ?uri rdfs:label ?name ; rdfs:comment ?summary ; a ?class ; dbo:country . ?class rdfs:label ?className ; rdfs:subClassOf owl:Thing . FILTER( lang(?name) = 'pt' && lang(?summary) = 'pt' ) } ORDER BY ?className Question 1: Is there a better way to restrict the names of the classes of this resources to only dbpedia-owl (so names of YAGO and other classes don't show) other than specifying rdfs:subClassOf owl:Thing ? Because right now it works but maybe in the future other classes (besides from dbpedia-owl) are also subclass of thingin fact, what makes sense for me is that all concepts should be So right now I get the response with the labels of the classes in english because I'm guessing the onthology lacks any more labels for other languages. Question 2: If I add labels in portuguese to classes in the mappings wiki (like in here it show in the next release of DBpedia and in the public SPARQL end-point, so my query above returns also this portuguese names and I'm able to classify based on portuguese names of the classes? I saw I already got editor permissions in the wiki, thanks for that! 
Question 3: Is there a roadmap for DBpedia releases? Question 4: Is there any kind of inference going on when retrieving things from DBpedia so I can asume some things? (for example, that all writers are also persons, etc) Is there a way to refine this rules to add support for better inferences or better awareness of cross-onthology classification? In the portuguese case, it seems to me easier if there is way to say \"all the things being skos:subject are also dbpedia-owl:Writer\" than to try to edit one by one the articles so they end up being explicitly dbpedia-owl:Writer . Sorry for the long post, thanks for reading ;). I appreciate any light on this, hopefully we'll be able to adopt DBpedia as a central part of the application we are developing around here (I hope so). Regards, Alex u> things from DBpedia so I can asume some things? (for example, that all > writers are also persons, etc) Is there a way to refine this rules to > add support for better inferences or better awareness of cross-onthology > classification? In the portuguese case, it seems to me easier if there > is way to say \"all the things being skos:subject > are also > dbpedia-owl:Writer\" than to try to edit one by one the articles so they > end up being explicitly dbpedia-owl:Writer . Responding to myself, I found this on the matter of inference/reasoning and categories: Haven't tried that yet, but I hope there is (or there will be) a more concise way of achieving the same. Still wondering if there is *ANY* inference going on in the process of quering DBpedia, and (if any) who and when it is done (when quering in the rdf store, when mapping wikipedia so it all becomes explicit and later there is no need for inference, etc). uInference is performed in backward-chaining fashion and optional. The DBpedia linksets are sometimes derived from a forward-chaining inference workflow. Only if inference context is enabled which is done by specifically using SPARQL pragmas to apply inference rules to SPARQL patterns. There are some pre-configured inference rules associated with the DBpedia instance which are visible via the interface at: can use \"Settings\" to select inference rules from a drop-down etcThe inference rules names can also be used in your Virtuoso SPARQL Query pragmas. Links: 1. SPARQL_Tutorials_Part_2.html uHi, Yes, it should be included in the next release. Also it will be available here: and here directly after you edit it (give or take some minutes) Now, here is the query that gives you the ontology from DBpedia-Live Note that its from a different graph. I'm also not sure, if all the Portuguese é è ã come out well, but we are working on it. PREFIX meta: CONSTRUCT {?s ?p ?o} FROM WHERE { ?b meta:origin meta:TBoxExtractor . ?b owl:annotatedSource ?s . ?b owl:annotatedProperty ?p . ?b owl:annotatedTarget ?o . FILTER(!(?p IN ( meta:editlink, meta:revisionlink, meta:oaiidentifier, ))).} cheers, Sebastian uThanks! I'll be doing testing on the live instance, which I just knew about yesterday it existed. Regards, Alex Em 09-09-2010 17:48, Sebastian Hellmann escreveu: uThanks, this is great stuff, although I kind of still get lost in the terms, being new to the whole LOD and semantic / ontology stuff. Do any of you know any other good mailing list or forum with higher traffic to address more general questions, maybe not specific to DBpedia? Regards, Alex Em 09-09-2010 14:34, Kingsley Idehen escreveu:" "Population density mapping broken" "uHello! Your population density mapping for the country infobox is broken. 
See for example india. The use of expr in the wikipedia infobox are causing the parsing to fail, which results in lies. I can't post this on the wiki because I can't edit it so, I thought I'd just spam you instead. Bye! Hello! Your population density mapping for the country infobox is broken. See for example india. The use of expr in the wikipedia infobox are causing the parsing to fail, which results in lies. I can't post this on the wiki because I can't edit it so, I thought I'd just spam you instead. Bye!" "lost in DBpedia: can't find all the cities and states in US" "uDear all, I need the US cities, counties and states information. I have searched DBpedia for quite a while to search all the US states, cities. I am using : SELECT DISTINCT * WHERE { ?city dbpedia-owl:isPartOf ?state . ?city dbpedia-owl:type dbpedia:City. ?state rdf:type yago:StatesOfTheUnitedStates } There are only 47 states and 8463 cities there. No New_Jersey, Alaska, Delaware, in the result. Can someone help me with the query? How can I find all the states and their cities and counties? Many thanks, Asiyah Jedi Order: There is no emotion, there is peace. There is no ignorance, there is knowledge. There is no passion, there is serenity. There is no chaos, there is harmony. There is no death, there is Force. Our Jedi Code: May peace and force be with you. Dear all, I need the US cities, counties and states information. I have searched DBpedia for quite a while to search all the US states, cities. I am using : SELECT DISTINCT * WHERE { ?city dbpedia-owl:isPartOf ?state . ?city dbpedia-owl:type    dbpedia:City. ?state rdf:type yago:StatesOfTheUnitedStates } There are only 47 states and 8463 cities there. No New_Jersey, Alaska, Delaware, in the result. Can someone help me with the query? How can I find all the states and their cities and counties? Many thanks, Asiyah Jedi Order: There is no emotion, there is peace. There is no ignorance, there is knowledge. There is no passion, there is serenity. There is no chaos, there is harmony. There is no death, there is Force. Our Jedi Code: May peace and force be with you. uAsiyah, Your query does not show Alaska or Delaware because there are no \"cities\" in those states (at least as far as that query is concerned). For example the dbpedia resource for < match it. DBpedia extracts information from Wikipedia, and if you edit the Wikipedia page for Wilmington, Delaware you will see that it uses the {{infobox settlement}} template but doesn't give a `settlement_type`, so DBpedia's mapping only calls it a generic Settlement uThank you Rob. I found there are a lot of relations between cities and states,so using OPTIONAL is my next step. Also, thank you very much for pointing out that some cities are not 'City' type in DBpedia. I certainly should look at that. As for another approach, if I change the wikipedia pages, how long it will take to transfer the change in DBpedia? Thanks, Asiyah Jedi Order: There is no emotion, there is peace. There is no ignorance, there is knowledge. There is no passion, there is serenity. There is no chaos, there is harmony. There is no death, there is Force. Our Jedi Code: May peace and force be with you. On Sat, Jan 31, 2015 at 9:05 PM, Rob Hunter < > wrote: uAsiyah, A new release of DBpedia only happens once in a while (about every year or so) but there is also something called \"DBpedia Live\" which listens to changes in Wikipedia and in the DBpedia mappings wiki, so it updates almost constantly. 
You can run your query against DBpedia Live using all the same software, just a different SPARQL endpoint. On 01/02/2015 1:36 PM, \"Asiyah Yu Lin\" < > wrote: OPTIONAL is my next step. 'City' type in DBpedia. will take to transfer the change in DBpedia? uHI, Rob, Thank you very much for you help! I will try them out. Much appreciated, Asiyah Jedi Order: There is no emotion, there is peace. There is no ignorance, there is knowledge. There is no passion, there is serenity. There is no chaos, there is harmony. There is no death, there is Force. Our Jedi Code: May peace and force be with you. On Sat, Jan 31, 2015 at 10:54 PM, Rob Hunter < > wrote:" "Question about dbpedia extraction framework" "uHi guys, yesterday I have asked this question, but unfortunately no body answer :( can any one answer my inquiry . Many thanks in advance. My question was : I would like to ask how can I install or run dbpedia Extraction-Framework in windows 7. Actually I tried to install it by using Ubuntu, but unfortunately it does not work with (i.e. there are some steps that do not work with me, I do not know why? and I'm not expertise with Linux). Actually I prefer to work on windows environment, so is it possible to run dbpedia Extraction-Framework on windows? if its possible so could you kindly explain to me step by step how can run it please? Many thanks. With Regards,Abdullah uHello Abdullah , if you are going to use windows you have to tailor everything to the windows environment , like setting installing windows versions from everything , scala , intellij , maven etc and as well as the Unix command line tools on windows cuz the extraction framework contains some bash command that you will need the to install it in order to run but overall theoretically it should work yes , luckily there's intellij for windows , so most of the procedures will be the same , not sure what problems you faced in ubuntu anbd either it will be repeated on windows or not. could you please be more specific ? here's the general procedures you should follow to install the instruction framework ps: the Mercurial repository is not supported anymore , instead clone the repo from github git clone git://github.com/dbpedia/extraction-framework.git 1. Installing scala 2. Download and Install Sun JDK 7 3. Download and install Intellij 12 4. Restart the pc and Open intellij 5. install scala plugin on intellij 6. Download and install Maven 3.0.5 7. add M2_HOME to environment variables 8. install Git 9. clone the repository 10. after this you can continue in the tutorial , windows and linux shouldn't differ thanks Regards On Thu, Jun 27, 2013 at 3:02 PM, Abdullah Nasser < > wrote: u‰PNG  uHi Abdullah, there was a \"has\" concatenated by mistake at the end of the link Can you try and report back? Also note that this is needed only for the AbstractExtractor, everything should work just with maven Dimitris On Thu, Jun 27, 2013 at 9:32 PM, Abdullah Nasser < > wrote: uDear Dimitris, Many thanks for your help. Yes the link work now with me. What I'm looking for it is extracting semantic links from wikipedia pages,so are steps that I'm trying to perform in this page allow me to extract semantic links? If not can you please guide me to the proper page which allows me to extract semantic link from wiki pages. Many thanks in advance. 
With Regards, Abdullah From: Date: Sat, 29 Jun 2013 12:32:05 +0300 Subject: Re: [Dbpedia-discussion] Question about dbpedia extraction framework

Hi Abdullah, there was a \"has\" concatenated by mistake at the end of the link. Can you try and report back? Also note that this is needed only for the AbstractExtractor; everything else should work with just Maven. Dimitris On Thu, Jun 27, 2013 at 9:32 PM, Abdullah Nasser < > wrote:

Dear Hady, Thank you very much for your response and clarification. I tried to run the project on Ubuntu (I have kept trying since last week), but unfortunately some files could not be reached by the Apache server. For example, when I got to the section called \"Prepare MediaWiki - Configuration and Settings: Please make sure the patch\", the link did not open for me; it returns an error. The step following it did not work for me either: Configure your MediaWiki directory as a web directory by adding the configuration below to Apache httpd.conf: Alias /mediawiki /path_to_mediawiki_parent_dir/mediawiki Allow from all. I do not know why I have a mediawiki folder in two paths; I tried both of them, but neither worked. The first path is: Alias /mediawiki /home/ansalm/core/resources/mediawiki Allow from all. I stored these lines in httpd.conf, but when I clicked the link under \"Verify MediaWiki and PHP configurations - Now visit the following URL with your browser\", it did not work. The second path I tried is: Alias /mediawiki /home/ansalm/extraction-framework/dump/src/main/mediawiki Allow from all, but it still gives the same error when I open the link. Please check the error snapshot attached to this message; the attached screenshot appears when I click the previous links. I asked you about Windows because I am working on Windows 7 and using NetBeans. So which operating system do you recommend for running the DBpedia extraction framework easily, Windows or Ubuntu? I spent too much time trying to run it on Ubuntu, but I could not get it to work. Many thanks for your help. With Regards, Abdullah

Date: Thu, 27 Jun 2013 15:46:40 +0200 Hello Abdullah, if you are going to use Windows you have to tailor everything to the Windows environment: install Windows versions of everything (Scala, IntelliJ, Maven, etc.) as well as the Unix command line tools, because the extraction framework contains some bash commands that you will need in order to run it. Overall it should theoretically work, yes; luckily there is IntelliJ for Windows, so most of the procedure will be the same. I am not sure which problems you faced on Ubuntu and whether they would be repeated on Windows or not; could you please be more specific? Here is the general procedure you should follow to install the extraction framework (PS: the Mercurial repository is not supported anymore, instead clone the repo from GitHub): clone the repository with git clone git://github.com/dbpedia/extraction-framework.git; install Scala; download and install Sun JDK 7; download and install IntelliJ 12; restart the PC and open IntelliJ; install the Scala plugin in IntelliJ; download and install Maven 3.0.5; add M2_HOME to the environment variables; install Git; after this you can continue in the tutorial, Windows and Linux shouldn't differ. Thanks, Regards On Thu, Jun 27, 2013 at 3:02 PM, Abdullah Nasser wrote:

Hi guys, yesterday I asked this question but unfortunately nobody answered :( Can anyone answer my inquiry? Many thanks in advance. My question was: I would like to ask how I can install or run the DBpedia extraction framework on Windows 7. I tried to install it using Ubuntu, but unfortunately it does not work for me (i.e. there are some steps that do not work, and I do not know why; I am not an expert with Linux). I would prefer to work in a Windows environment, so is it possible to run the DBpedia extraction framework on Windows? If it is possible, could you kindly explain to me step by step how to run it? Many thanks. With Regards, Abdullah

uDear Abdullah, This paper [1] explains in detail how the framework works. We provide a big number of pluggable extractors and their output is explained in Table 1. You should filter the extractors you need and enable *only* the relevant ones in your \"extraction-config-file\". You may also find some extra info on the DBpedia download page [2] Best, Dimitris [1] [2] On Sat, Jun 29, 2013 at 3:55 PM, Abdullah Nasser < > wrote:

uDear Dimitris, Many thanks for your help. I will read this paper and try to extract semantic knowledge from Wikipedia pages. With Regards, Abdullah (quoting his earlier reply: Dear Dimitris, many thanks for your help. Yes, the link works now for me. What I am looking for is extracting semantic links from Wikipedia pages, so will the steps I am following on this page allow me to extract semantic links? If not, can you please guide me to the proper page which allows me to extract semantic links from wiki pages. Many thanks in advance.)

uDear Hady, I'm really sorry to disturb you. I spent around a month trying to run the DBpedia extraction framework on Windows, but it does not work for me. Mr. Dimitris advised me to read a paper that mentions other tools which do the same thing as the DBpedia extraction framework, such as \"JWPL\"; however, when I contacted the JWPL mailing list they advised me to work with the DBpedia extraction framework, which is more accurate in extracting semantic information from Wikipedia pages. I followed the 10 steps you gave me in your last e-mail; I think the first 9 steps worked for me (though I am not quite sure: for some steps, such as installing IntelliJ, the tutorial says \"On the right side of IntelliJ, open a tab entitled 'Maven Projects'. Click on the recycle button ('Reimport all maven projects'). The compilation of the source files starts.\" and also \"Open 'Parent POM of the DBpedia framework' -> Life Cycle and select (execute) the commands clean and install. It will take a long time to compile the whole project.\", but I did not find any of that when I installed IntelliJ). When I reached step number 10, which is \"after this you can continue in the tutorial, Windows and Linux shouldn't differ\", I got confused about whether I should start from the beginning of that link or from somewhere else. Many thanks in advance. With Regards, Abdullah

uDear All, Please, can anyone help me with the problems I have found in running the DBpedia extraction framework on Windows 7? Do you have a tutorial that explains how to run the DBpedia extraction framework? I am really stuck installing it; I followed all the steps mentioned by Mr. Hady, but unfortunately some steps do not work for me, as I mentioned to Mr. Hady. Many thanks in advance. With Regards, Abdullah

\"Apostrophe encoding\" \"uHi, I am wondering why apostrophe encoding does not appear to be working when looking at resources directly? One of these should work, no? Thanks, Csaba\"

\"dump process problem (tables.sql not found)\" \"uHello, I am interested in the DBpedia extraction framework and want to compile and run it. I started from the link and followed the steps, and I have a problem in the dump process. I installed the framework into the local Maven repository by running mvn install from the extraction directory, and it finished successfully: [INFO] Parent POM of the DBpedia framework SUCCESS [10.858s] [INFO] Core Libraries SUCCESS [10.982s] [INFO] Wikipedia dump SUCCESS [2:23.660s] [INFO] server SUCCESS [1.170s] [INFO] Live extraction SUCCESS [5.647s] [INFO] scripts SUCCESS [0.858s] [INFO] Wiktionary Dump SUCCESS [1.373s] Then, in the dump step, I ran mvn scala:run from the directory extraction/dump and I get a Java exception: Exception in thread \"Thread-1\" java.io.FileNotFoundException: d:\wikipediaDump\tables.sql (The system cannot find the path specified) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.<init>(FileOutputStream.java:179) at java.io.FileOutputStream.<init>(FileOutputStream.java:70) at java.io.PrintStream.<init>(PrintStream.java:196) at org.dbpedia.extraction.dump.Download$DumpDownloader$.downloadMWTable(Download.scala:89) at org.dbpedia.extraction.dump.Download$.download(Download.scala:23) at org.dbpedia.extraction.dump.ConfigLoader$.updateDumps(ConfigLoader.scala:61) at org.dbpedia.extraction.dump.ConfigLoader$.load(ConfigLoader.scala:33) at org.dbpedia.extraction.dump.Extract$ExtractionThread.run(Extract.scala:31) uHi, On Fri, Nov 12, 2010 at 2:51 AM, amira Ibrahim abd el-atey < > wrote: There was a small bug in the download program that updates the Wikipedia dumps. It is now fixed. But as far as I can see, the downloads of the Wikipedia dumps won't work at the moment anyway, because the download page ( is \"under maintenance\". If you already have a Wikipedia XML dump locally, you can disable updateDumps in config.properties and place the XML dump in dumpDir (within a subfolder that is named after the dump date). Then the framework will not attempt to download the most recent dump and will start the extraction right away. Best, Max\"
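For readers following this thread, the installation and extraction steps above condense into a few shell commands. This is only a sketch on a Unix-like shell: the repository URL and the mvn scala:run launcher are taken from the messages above, but module and property names (dump/, config.properties, updateDumps) vary between framework versions, so treat them as assumptions to check against your checkout.

    # clone and build the framework (repository URL quoted by Hady above)
    git clone git://github.com/dbpedia/extraction-framework.git
    cd extraction-framework
    mvn clean install        # builds all modules into the local Maven repository

    # dump-based extraction: edit the properties file first (dump directory,
    # languages, extractors); per Max's advice, turn off the automatic dump
    # download (the exact property syntax, e.g. updateDumps=false, is an
    # assumption) and place an existing XML dump under dumpDir/<dump-date>/
    cd dump
    mvn scala:run            # starts the extraction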
\"Fw: request\" \"uHello, I want to participate in mapping the Arabic DBpedia chapter. Can you create a username and password (account) for me, or forward my request to someone who can help? Note: my account on the wiki is: amanyslamaa, and my account on DBpedia is: Best Regards, Eng. Amany Slamaa, Teaching Assistant, Computer and Information Science\"

\"Birthplaces query\" \"uHi Stephen, Thanks for your interest in DBpedia. I am sending a copy of my answer to our discussion list [1] (you are very much invited to join), since it might be of interest for other members of the DBpedia community too. Yes, it is! The DBpedia project offers a SPARQL endpoint [2] as well as a query builder [3]. Both can be used to answer queries like yours. I created a query using our query builder for you; you can review the result at: (Maybe someone else from the list will create a corresponding SPARQL query for you.) As you can see, there are only 138 results :-( What's the reason? If you look at a query retrieving a list of people together with their birth dates, death dates and birth places, we get 3487 results: There are around 50k places with longitude and latitude in DBpedia. The reason why we unfortunately cannot get both together with a simple query is that the birth place of a person often just contains a string representation (sometimes including a reference/link) of the place. Hence, pattern matching techniques have to be applied to get more matching results. Unfortunately, I am too busy right now to help with that, but you are invited to play around yourself, and maybe someone else on the list is able to provide some support. Also, the improved datasets, which we hope to release soon, might result in more precise query answering. Best regards, Sören [1] [2] [3] http://wikipedia.aksw.org\"
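Since the message above invites someone to post a concrete SPARQL query, here is a hedged sketch of the kind of query meant: people whose birth place is itself a resource with coordinates. It is written against the public endpoint and the current mapping-based ontology rather than the raw infobox properties available at the time of that thread, so the property names (dbo:birthPlace, geo:lat, geo:long) are assumptions in that sense.

    curl -G 'http://dbpedia.org/sparql' \
      --data-urlencode 'format=text/csv' \
      --data-urlencode 'query=
        PREFIX dbo: <http://dbpedia.org/ontology/>
        PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
        SELECT ?person ?place ?lat ?long WHERE {
          ?person a dbo:Person ;
                  dbo:birthPlace ?place .
          ?place  geo:lat ?lat ;
                  geo:long ?long .
        } LIMIT 100'

As discussed above, such a query only matches people whose birth place is a typed resource; birth places stored as plain literals will not appear in the results.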
\"How do I query wikilinks?\" \"uHi, I'm using: I ran this query: select ?from, ?to where { ?from dbpprop:wikilink ?to. } and received no results. I then saw that that predicate had been deprecated for some reason, so I tried its replacement: select ?from, ?to where { ?from dbpedia-owl:wikiPageWikiLink ?to. } This also gave me no results. Why? How do I query wikilinks? More generally, how can I see which predicates are available? I tried: select distinct ?pred where { ?from ?pred ?to. filter(regex(?pred, \"link\")). } but this also returns no results, probably because I don't know SPARQL and the regex filter doesn't work on predicates for some reason. I then tried select distinct ?pred where { ?from ?pred ?to. } and looked for any predicates that looked like \"has a wikilink to\". I found external reference links, not wikilinks. (I was only able to find out what it means by guessing.) uHi James, Not all DBpedia datasets are loaded onto the endpoint. The page links dataset is not loaded, as can be seen from [1], mostly due to its size. Actually, the fact that these data were loaded for 3.9 was an \"undocumented feature\", as officially [2] it wasn't in the list :) The data you are interested in are in the \"page_links_nt\" dumps (indeed the dbpedia-owl:wikiPageWikiLink property is used there), which can be downloaded from [3]. Cheers, Volha [1] [2] [3] On 10/20/2014 12:32 AM, James Fisher wrote:\"
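Because the page-links dataset is not loaded on the public endpoint, the practical route is the dump file pointed to above. A minimal sketch; the download URL follows the usual naming pattern for the 3.9 English release and is an assumption, so adjust it to the release and language you use.

    # fetch the English page links dump (URL pattern is an assumption)
    wget http://downloads.dbpedia.org/3.9/en/page_links_en.nt.bz2

    # outgoing wikilinks of one page, straight from the compressed dump
    bzgrep '^<http://dbpedia.org/resource/Berlin>' page_links_en.nt.bz2 | head

    # rough triple count without unpacking to disk
    bzcat page_links_en.nt.bz2 | wc -l

On the regex question above: a likely reason the predicate filter returned nothing is that regex() expects a string, so a filter over predicates generally needs str(?pred) first; that is a general SPARQL point, not specific to this endpoint.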
\"Test with DISTINCT and ORDER BY\" \"uHello, when I put the query\"

\"Using Lookup Service.\" \"uHi! I installed Lookup following these instructions: Running a local mirror of the webservice - But now I don't know how to use Lookup. I'm using Jena and AllegroGraph and I'm working on Linux. I found an example code here: I copied it into my Eclipse and I don't get errors in Eclipse, but I don't know how to use it and don't know if it's complete. I want to use the keyword method. Is there any tutorial? And what do I have to install besides Lookup? I've done a bunch of research but haven't found anything clear. Please help me. Thank you!\"
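The Lookup service is a plain HTTP API, so it can be exercised without Jena or any other client library, which often answers the how-do-I-use-it question by itself. A hedged sketch against the public instance; for a local mirror, replace host and port with whatever your installation listens on (the local port below is only an assumption, check what your server prints on startup).

    # public instance, keyword search
    curl 'http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=Berlin&MaxHits=5'

    # local mirror (hypothetical host/port)
    curl 'http://localhost:1111/api/search.asmx/KeywordSearch?QueryString=Berlin&MaxHits=5'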
\"Mappings statistics not working\" \"uDear All, The DBpedia mappings statistics page seems not to have been working for a couple of days. When I try to access the mapping statistics page for English I keep getting the following error message: ||Service Temporarily Unavailable. The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.|| Could you please fix it as soon as possible, and while you are at it, could you please also add Arabic (ar) to the mappings statistics page? Kind Regards, Haytham uThanks for the report! I'm fixing it, it should be up again in 10 minutes or so. On Wed, Mar 7, 2012 at 13:19, haytham alfeel < > wrote: uHaytham, It took a bit longer than I had hoped, but the server is back up now. I'll add the Arabic mapping stats later today. Christopher On Wed, Mar 7, 2012 at 13:38, Jona Christopher Sahnwaldt < > wrote:\"

\"Strange text strings in wiki data dump\" \"uHi guys, I am doing some work on analyzing wiki dumps. However, I am confronted with a headache of a problem: some text (under ) seems to be malicious. It may contain only one dirty word, repeated again and again. What makes it worse is that some of these strings seem to be endless, which makes my parser get stuck when reading them. I extracted such text to read it in vim, and vim shows that it has an exact number of lines, but when I page down it just cannot reach the end and gets stuck in endless garbled text. Have you ever run into such a problem? Thanks a lot. Best regards, uCould you give us an example? Which file, which page title? On Apr 3, 2013 7:56 AM, \"Ning Zhang\" < > wrote:\"

\"Redirects resolved in DBpedia 3.8?\" \"uIf what my tools tell me is right, it looks like DBpedia is now following redirects in the page_links file. Is this a deliberate decision? If so, that's great, because it eliminates a critical step in doing any kind of processing on that file. uOn Sat, Sep 8, 2012 at 4:10 PM, < > wrote: That's correct. We decided to resolve redirects in almost all datasets. More precisely, we first extract all redirect pages, then resolve transitive redirects and remove redirect cycles, and finally replace all link targets by their transitive redirects where necessary. If someone still needs the original data, it is available in files named _unredirected_, e.g. page_links_unredirected_en.nt.bz2 [1]. Thanks for the feedback! We also figured that for most applications it makes more sense to resolve redirects, even if there is a small percentage of 'false positives' where a redirect doesn't actually resolve a synonym but points to a slightly different subject. Christopher [1]\"

\"decimal and grouping separators doubt\" \"uHello everybody, I was having a look at the DBpedia data about cities, for example the area total property. I would like to know how you deal with the different decimal and grouping separators used in different countries. For example, I found a Brazilian municipality whose area is displayed with a comma, as in 282,569 km², and I think the \",\" is a decimal separator according to the Brazilian convention [1]. However, in the DBpedia dataset I found the following value: 2.82569E11. My guess is that the separator is being treated as a grouping separator, as is the convention in the United Kingdom [2], for example. I would be very thankful if you could enlighten me. Cheers, María [1] [2] uHi Maria, thanks for the report! The problem is that the number is displayed with a comma as the decimal separator, but in the source text of the page [1] the decimal separator is a dot: | área = 282.569 The template [2] that generates the HTML from the source expects the number to use a dot and formats it appropriately for Brazilian Portuguese: {{formatnum:{{{área}}}}} To fix this problem, we will have to extend our extraction framework so that users can specify which decimal separator is used in the values of a certain template property. @developers: We will have to discuss what's the best way to do this. - Add configuration values decimalSeparator and groupSeparator whose values may be dot or comma: \".\" or \",\" (a bit hard to read). - Add a configuration value numberFormat that takes a language code, in this case \"en\". - Add a configuration value numberFormat that takes a decimal separator and a group separator: \".,\" (also a bit hard to read). Any other ideas? JC [1] [2] [3] On Wed, May 30, 2012 at 1:06 PM, María Poveda < > wrote: uWhat about a 'statistical' approach? Most people will type numbers in their locale format, and the common pitfall is to use the English format. If the number format is correct English, and the statistics say that most numbers are in xx format, the number could be converted to the local format by a conversion function. Every language has a number extractor to parse numbers in its locale; we could add this conversion function. -Mariano u+1 to this. Accepted values: - \"dot\" or \".\" - \"comma\" or \",\" - \"space\" or \" \" (it is the case that group separators are sometimes spaces) Cheers, Pablo On Wed, May 30, 2012 at 4:29 PM, Jona Christopher Sahnwaldt wrote: uOn Wed, May 30, 2012 at 10:29 AM, Jona Christopher Sahnwaldt < > wrote: POSIX sorted this all out a couple of decades ago (and standardized it). Why not just use the infrastructure that they've made available (and reference that standard)? Note that they specify monetary and non-monetary number formatting separately. It may seem like overkill, but there's almost certainly a good reason for it (although I don't know what it is). Tom uHi Mariano, I'm not sure I understand correctly what you mean. I guess a statistical approach could mean two things: 1. heuristically figure out which format is used in a Wikipedia edition and use that format to parse all values, or 2. heuristically figure out which format is used for a certain template property and use that format to parse values for this property. I don't think either would work well. For example, most numbers in the source text of one Portuguese article use a comma as the decimal separator (as Portuguese texts usually do), but most values in the source text of another use a dot. So we can't use one separator for all pages of a language; we have to treat specific properties differently. And I don't think there is a good way to figure out which format a property uses. In this case, we would have to figure out that Rio Rufino has an area of about 282 km², not ~282000 km². We might use a heuristic like 'cities usually have an area of less than 10000 km²', but such a heuristic might fail for either very large or very small cities, and we would have to introduce all kinds of different heuristics. I think it's much simpler to allow the editors of the mappings wiki to specify that a certain template property uses a format that differs from the main one for this Wikipedia edition. Cheers, JC On Wed, May 30, 2012 at 4:57 PM, Mariano Rico < > wrote: uHi Tom, Java offers all these features in NumberFormat, DecimalFormat etc., and we use these APIs, for example in ParserUtils [1]. We use the appropriate format for each Wikipedia language, but some Wikipedias (for example Portuguese, but there are others) do not consistently use the number format for their language. :-( Cheers, JC [1] On Wed, May 30, 2012 at 9:54 PM, Tom Morris < > wrote: uI think I tend towards the language code solution. It feels safer to use a different locale altogether than to modify separate aspects of a complex beast like a number format. Advantages: + Relatively simple to implement: add a Locale constructor parameter in ParserUtils and use it when getting NumberFormat / DecimalFormat. Four data parser classes and three mapping classes will also need this constructor parameter. The MappingsLoader will have to parse the mappings wiki settings. For the other solutions, we would either need two (or maybe three) parameters instead of one, or construct our own NumberFormat from the decimalSeparator and groupingSeparator (and possibly other) property settings. + Compact setting: one setting like \"en\" changes all separators, not just the decimalSeparator.
If decimalSeparator and groupingSeparator can be set separately, editors will probably forget one or the other, which will lead to problems. We could add implicit rules like \"if the editor set the decimalSeparator to comma but no groupingSeparator, set groupingSeparator to dot\" and vice versa, which is ugly and confusing. Disadvantages: - Possibly more difficult for mappings wiki editors. The difference between comma and dot is plain to see, but that these characters are connected to a thing called 'locale' is less obvious. - Less flexible. If there ever is a property that uses decimalSeparator, groupingSeparator and maybe other settings not covered by any Locale, we're in trouble. I think (but we should collect hard data about this) that there are mainly two cases: numbers that use comma as decimalSeparator (and dot or space as groupingSeparator), or vice versa: dot as decimalSeparator (and comma or space as groupingSeparator). Most template properties in a Wikipedia language edition should use the format that is 'canonical' for that language. The others will probably use the 'English' format. Space as groupingSeparator is already allowed for all languages. Java doesn't support it, but I recently added that behavior to ParserUtils.parse(). Regards, JC PS: of course, there are exceptions: I just found out that in the German Wikipedia, articles that start with uThis sounds good to me. Marking something to \"be parsed as you'd parse english numbers\" is easy enough to get. Cheers, Pablo On Fri, Jun 1, 2012 at 6:06 PM, Jona Christopher Sahnwaldt < >wrote: uOK, then I'll put this on my TODO list. I don't think this feature will make it into the 3.8 release though, as the main extractions are already done. On Fri, Jun 1, 2012 at 6:20 PM, Pablo Mendes < > wrote: uHello all, I've just seen that according to the ontology the areaTotal ( most of the values of the 3.7 version are km2. Here you can either extract the values according to this definition or change the ontology? It could make sense to extract it as km2 as the areas of populated places are usually quite big to be expressed in meters. María On Fri, Jun 1, 2012 at 7:01 PM, Jona Christopher Sahnwaldt < >wrote: uHi Maria, values for square metres (the SI unit) because it's easier to compare and order across many different resources. We also have the \"specific\" property kilometres because that's what's conventionally used for populated places. For example, dbpedia-owl:PopulatedPlace/areaTotal 891.85 dbpedia-owl:areaTotal 891850000.000000 I just looked at three or four other pages and they look ok to me. Do you have an example where dbpedia-owl:areaTotal uses square kilometres? Cheers, JC On Wed, Jun 6, 2012 at 5:46 PM, María Poveda < > wrote: uHi Christopher, .- from km² .- in the dbpedia dataset the areatotal is: 2.82569E11 m2 (from the 3.7 pt dbpedia version) If I'm not wrong it should be 2.82569E8 m2. María On Thu, Jun 7, 2012 at 5:18 PM, Jona Christopher Sahnwaldt < >wrote: uHi María, that's the problem we discussed last week. We basically know how to handle this problem, I just don't know yet when we will have the time to implement the solution. The Wikipedia page uses '.' as the decimal separator - not in the HTML view, but in the wikitext source: | área = 282.569 Because '.' is usually the thousands separator in Portuguese, we extract 282569.0 km² and 2.82569E11 m² for Rio Rufino: It's quite possible that there are other kinds of errors in the DBpedia extraction, so if you find other mistakes let us know. 
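A quick way to carry out the source check discussed here is to pull the raw wikitext through the MediaWiki raw-page URL; a small sketch for the Portuguese article examined above (the grep pattern simply isolates the infobox parameter in question).

    # fetch the raw wikitext and look at the disputed infobox value
    curl 'https://pt.wikipedia.org/w/index.php?title=Rio_Rufino&action=raw' | grep 'área'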
Please check the wikitext source of a Wikipedia page. Often, the cause of the error can be found there. JC On Fri, Jun 8, 2012 at 11:48 AM, María Poveda < > wrote: uI think Maria found that this is not consistently the case. It seems that some pages use . others use , Is this right, Maria? Or is this between English and Portuguese? Cheers, Pablo On Fri, Jun 8, 2012 at 12:29 PM, Jona Christopher Sahnwaldt <" "A clear case of a missing label" "uI find this rdf:type in the latest dumps, but he doesn't have a label [ dbpedia_3.5.1]$ bzgrep Hidehiko_Shimizu labels_en.nt.bz2 [ dbpedia_3.5.1]$ There's a page about this guy in Wikipedia that looks pretty normal, except for the fact that the deletionists want it to go away: What's up here? It seems to me that there ought to be some kind of acceptance tests done on the dumps so that we know the key structure makes sense. For the last few versions of dbpedia, it's been (mostly) true that assertions are only made about things that (i) have a label, or (ii) are the subject of a redirect. The exception to that has been that there are wikilinks to pages that don't exist, precisely because there ~are~ wikilinks to pages that don't exist. I can see that some good may come out of extracting Engines out of automobile descriptions and PersonFunctions out of persons, but this one just looks like a glitch." "PhD Scholarship Opportunity on Visualisation of Expressive Ontologies at Monash University, Australia" "uApologies for cross posting. ==PhD Scholarship Opportunity on Visualisation of Expressive Ontologies at Monash University, Australia== The Opportunity Applications are invited for a funded PhD scholarship position at Monash University, Australia. The successful candidate will perform research in the junction of information visualisation, bioinformatics, and the Semantic Web. The candidate will work under the supervision of Prof. Falk Schreiber and Dr. Yuan-Fang Li at the Faculty of Information Technology, Monash University. The Gene Ontology (GO) [1] is one of the most widely used biomedical ontologies. It comprehensively describes different aspects of genes and gene products under three broad categories, (1) biological process, (2) cellular components and (3) molecular functions. GO is currently being actively extended in the LEGO project to create a new annotation system to provide more extensible and expressive annotations. This project aims to investigating novel visualisation techniques for GO-LEGO and its integration with the Systems Biology Graphical Notation (SBGN) [2]. The information can be seen as networks, and although there are some solutions for network layout, none is particularly tailered towards this application. Specifically we will investigate how domain knowledge captured in ontologies can assist in creating high-quality layouts. [1] Ashburner, Michael, et al. \"Gene Ontology: tool for the unification of biology.\"Nature Genetics 25.1 (2000): 25-29. [2] Le Novère et al. \"The Systems Biology Graphical Notation\" Nature Biotechnology 27 (2009): 735-741. Candidate Requirements The successful student will have an excellent academic track record in computer science, bioinformatics or a related discipline. Applicants are expected to possess very good programming and analytical skills, be able to work independently as well as cooperatively with others. Research experiences in ontologies, knowledge representation, bioinformatics or information visualisation would be highly advantageous. 
Candidates will be required to meet Monash entry requirements which include English-language skills. Details of eligibility requirements to undertake a PhD are available at Remuneration The scholarship will be awarded at the equivalent of Australian Postgraduate Award (APA) rate, which is currently $25,392 per annum full-time (tax-free). The scholarship is tenable for three years, and extendable for up four years, dependent on satisfactory progress and availability of funding. Organisational context Monash is a university of transformation, progress and optimism. Our people are our most valued asset, with our academics among the best in the world and our professional staff revolutionising the way we operate as an organisation. For more information about our University and our exciting future, please visit The Faculty of Information Technology ( It is one of the few faculties of information technology in the world and one of the largest academic information technology units within a tertiary institution in the world. Its research-intensive, multidisciplinary, international capabilities provide it with a set of exciting teaching, research, and engagement opportunities that position it uniquely within the tertiary sector. The Faculty has a strong commitment to providing the highest-quality learning opportunities for its students. The Faculty's research degrees are designed to produce high-calibre scholars who are capable of cutting-edge pure or applied research in either academe or practice. The Faculty strives to undertake high-quality, high-impact research. It has chosen to focus its efforts through four research flagship programs, namely: * Computational Biology * Data Systems and Cybersecurity * IT for Resilient Communities * Machine Learning * Modelling, Optimisation & Visualisation These flagships provide a highly focused research direction in particular areas where the Faculty has significant and internationally recognised research expertise. In addition, they inform the teaching activities of the Faculty. Submitting an Expression of Interest (EOI) Interested applicants should submit the following: * The expression of interest (EOI) form ( * A resume in PDF containing tertiary academic track record and achievements, research and work experiences (if any), and contact details of 3 referees, * Academic transcripts, and * A brief statement explaining your motivation and suitability for undertaking PhD research on this topic. Enquiries For more details about the project and the scholarship please contact: * Prof. Falk Schreiber ( ) * Dr. Yuan-Fang Li ( )" "Postcode data" "uApologies if this covers old ground but I am new to DBpedia and after a cursory hunt around the archive nothing quite answered my question. I was searching through the DBpedia?wikipedia data set and entered the UK postcode N194EH, at present there is no entry fro this. Is there away to enter this information direct to DBpedia ? or does it have to go through Wikipedia ? If so when is the next download from Wikipedia ? cheers for any help you can give. Kevin uHI Kevin, DBpedia is a community effort to extract structured information from Wikipedia and to make this information searchable on the Web. About two times a year the teams extract all the data from wikipedia and publish this as semantic data on However we now also have a live instance where the latest article changes from wikipedia are processed. 
This means that if you change an article in the English wikipedia, these changes in as far as they are mapped to the semantic will be merged into the live instance within minutes. Currently there are two sparql endpoints for this effort and Patrick uHi all, just a short note: UK postcode related information as linked data is provided by UK Ordinance Survey at John Goodwin (@gothwin on Twitter) can help. Kind regards, Daniel On Wed, Nov 9, 2011 at 9:24 PM, Patrick van Kleef < >wrote: uOn 11/9/11 2:59 PM, kevin carter wrote: uHello thanks for your help with the postcode query. The reason I ask is I have over the past five years been making a digital public artwork called www.landscape-Portrait.com. This has involved me and various collaborators working with communities and encouraging them to record their own profile of an area as a counterpoint to that provided by postcode demographics - see www.landscape-portrait.com. I am now in the position of having an archive of geo stamped video portraits of various locations in the UK and I wanted to link these to postcode. My first thought was to do this via Wikipedia/DBpedia; does this sound plausible ?, I am somewhat of a newbie to the world of linked data. Any thoughts of advice would be appreciated, Kevin Carter. On Wed, Nov 9, 2011 at 8:34 PM, Daniel Koller < > wrote: uOn 10/11/2011 12:43, kevin carter wrote: I think the Ordnance Survey postcodes Linked Data should work for you. The URLs are constructed from the postcode itself in a logical way, e.g. my postcode of RH15 8JA has this URL: so you don't need a fancy lookup service to find out what the correct URL is for a given postcode. If you access this URL in a browser you will see that there is a lot of useful information, including lat/long and administrative areas. Access it as RDF (by content negotiation - or by sticking '.rdf' on the end!) and you get back a machine-processible version of this information. Presumably you want to create a linked data description of each of your videos which includes the postcode? Is the intention to put these descriptions up on Wikipedia, or will you be creating your own online resource? Richard" "Extraction Framework Parallelisation" "uHi all, are there any efforts or strategies for parallelised extraction? The extraction framework obviously works single threaded which makes it a pretty long running process. Without to many changes to the code, would it be possible to split the pages-articles.xml source and run multiple instances on these shards? Ideas and solutions are welcome. :) Best Magnus uOn 16 October 2013 18:00, Magnus Knuth < >wrote: No it doesn't. With the default settings, it should use all available CPU kernels. What makes you think so? which makes it a pretty long running process. us/kernel/core/g On 16 October 2013 20:10, Jona Christopher Sahnwaldt < >wrote:" "Graph URI - dotNetRDF" "uHi When I execute on DBpedia the following query: \"SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o} }\" result contains entries like: - b3sonto, - b3sifp, - dbprdf-label, - virtrdf-label - facets which seem to be invalid URI for dotNetRDF library. Is it a bug in DBpedia or maybe a problem with dotNetRDF ? Best Regards Przemek Misiuda" "Wikipedia category browser" "uHi, DBpedians. While we've been working on Freebase (www.freebase.com) we've built some tools to help sift through Wikipedia categories looking for ones that label articles as of a given type. 
We try to type as many Freebase topics as possible, so a common question we ask is of the form \"what are all the categories that list Wikipedia articles about people?\" We're making our category browser tool available so that other groups can easily search and navigate through Wikipedia categories. The browser lives here: Try a query like \"biography\" to see how it works. I hope you find this useful. If you have any questions, please let me know. And thank you in advance for any feedback. Note that we've only loaded it with categories from the English language Wikipedia. The data is current as of 23 Apr 2007. uPatrick Tufts wrote: Pat, Conversely please take a look at the following: 1. 2. (equivalent of entering \"Venture Capital\" into your demo) 3. 4. An RDF Browser view of Sillicon Valley Companies There is a best practice at work in all of the demos: Data Link Traversal via URI Derferencing. This basically taking advantage of what RDF and SPARQL based infrastructure deliver. uHi Pat, thanks for the link to the tool, which is definively useful for manually classifiying Wikipedia categories. In the last weeks, we had a look at the YAGO classification and included some of it into DBpedia. This classification is used by the DBpedia search interface and we will make it available for download shortly. We also had a student trying to manually build a class hirarchie ontop of the YAGO classes (not perfect yet, but clearly a start). The results of this work will also go online shortly. Pat: Are you at WWW2007 in Banff? It would be great to meet and discuss how we could align our effort a bit more closely. There is also a F2F meeting of the Linking Open Data project at Banff. See: It would be great if you would join the meeting and we could hear your opinion on our project. Cheers Chris uPat, Thanks much for this, I like it a lot. Combined with the links Kingsley gives, we've got a variety of views into the data. That's especially exciting to me because of my background in teaching. The three columns are appealing from that perspective. When I used to teach basic critical thinking skills to first-year university students, I spent a lot of time trying to develop an intuition for moving between particulars and abstractions both as a skill for analysis and as a skill for research. The interface here fosters moving up and down that abstraction ladder nicely. If I were still teaching, I'd take my students here first, then take the more advanced students over to Kingsley's links to work on more precise relationships between data. This is powerful stuff, thanks. Patrick Gosetti-Murrayjohn Instructional Technology Specialist University of Mary Washington Patrick Tufts wrote: Pat, Conversely please take a look at the following: 1. 2. (equivalent of entering \"Venture Capital\" into your demo) 3. 4. An RDF Browser view of Sillicon Valley Companies There is a best practice at work in all of the demos: Data Link Traversal via URI Derferencing. This basically taking advantage of what RDF and SPARQL based infrastructure deliver. uOn 4/30/07, Kingsley Idehen < > wrote: Are the common best practices articulated briefly somewhere? For example apart from traversal via uri dereferencing is content-negotiation part of the \"best practices\"? I only ask because I'm about to experiment w/ making some data at the Library of Congress available in a minimalist way and I'd like to follow best practices where I can. //Ed uHi Pat, I, too, found this approach very interesting. 
I like the progressive drill down in the three panels. However, what this did bring more to mind was the weakness in the overall Wikipedia structure for such approaches. In fact, I wrote a piece on that today, The DBpedia folks are also dealing with this categorization/organizational issue via YAGO and other means, according to Chris' response and other threads. I'll be following this thread with keen interest. I'm also personally interested in hearing more about how you are using MQL for this, as well as any SPARQL - MQL comparisons. If you have any spare time ( :) ! ), I'd like to see a posting to this mailing list on that. Good work and thanks for stimulating some thought! Mike Chris Bizer wrote: uHi Michael, To clarify a bit uHi Robert, Well, as is not unusual, I have gotten some important portions of this story wrong. I apologize for my errors. While I'm not wanting you to \"pump\" my blog, I do suggest you submit the substance of your following email to my posting as a comment. Since the blog feed has been issued, it is nigh impossible to retract errors. I'm extremely interesting in having a canonical reference structure of topics/subjects to which third parties can bind or reference. I'd be happy to work with you in whatever way possible to help get such a structure in place. Thanks, again, Robert. Mike Robert Cook wrote: uHi Robert, Actually, it is not right to ask you to make the effort to post a comment correction to my own blog. You have already brought the information to my attention. I will therefore submit my own comment correcting the information as you point out below. I therefore only ask that I be able to quote and reference you. Mike Michael K. Bergman wrote: uHi Michel, Robert and Pat, I think the open, community approach that Freebase is using to build such a structure is very promising. Once Freebase is public and this structure has envoled by community editing, I think we should merge the Freebase classification schema with the DBpedia data. This structure doesn't have to replace the YAGO stuff we are currently doing, as a nice feature of the RDF data model is that you can have several alternative class hirarchies at the same time and users can choose the one that fits their needs for a specific application. Bringing the Freebase classification into the Semantic Web by serving it as Linked Data with dereferencable URIs would also allow third parties to reference classes and bind external data to Freebase classes. Serving the Freebase classification on the Semantic Web could be done by a similar architecture as we are using for the RDF Book Mashup get a live view on the Freebase database. This is all very interesting stuff :-) Chris uRobert, Thanks for all of the info; corrections have been made. Mike Michael K. Bergman wrote: [snip]" "Wikipedia disambiguation pages" "uI'm working with the Wikipedia data dumps to do some data processing. I know that a page that contains \"{{Disambig}}\" is considered a disambiguation page. But apparently there are many other tags that also can be used for marking disambiguation pages, such as {{disambig-cleanup}}, {{airport disambig}}, {{Geodis}} etc. I found these examples on Does anyone else here know, what is the full list of these disambiguation templates? Or how can I generate a full list of disambiguation pages? Also, if I work with the international data dumps, they have other tags (in respective language). So just looking for \"{{Disambig}}\" would not work, I would need this tag for each language. 
How can I solve this if I want write a script that detects all disambiguation pages for other languages. Also, some pages start with a prefix such as \"Template:\", \"User:\", \"List_of_\", \"Wikipedia:\", \"Image:\" etc. I would like to avoid process pages with these type of prefixes. And I would like to do it for all languages. Is there a list with of all these prefixes (both for English and foreign languages)? If not I can always write a script that detects what prefixes are frequently occurring for each language, but I thought there might be a more formal way of getting a full list of these type of prefixes. uOn Mon, Oct 20, 2008 at 8:13 AM, Omid Rouhani < > wrote: The list is: This is the list that mediawiki uses to generate the list of pages linking to disambiguation pages. It should probably exist on all language projects. A quick check of French and German both have it. (the wonderful Germans, of course only have one clean template it looks like) :D Many of those are namespaces, which are available from the database I would think. (I don't know how you're getting the data) I probably can't help much here, but here is something to read :) Judson User:Cohesion" "how to use ConstantMapping" "uHi! Should I ask usage questions on specific mapping templates here, or in the wiki? I'd prefer to ask my question here: so the answer is recorded for others to see, but I'm not sure people are reading there. 1. How to specify a URL? The doc says Please provide decoded URIs here, i.e. specify \"Billy Murray (actor)\" instead of \"Billy_Murray_%28actor%29\". We see an example that follows the doc: {{ConstantMapping | ontologyProperty = industry | value = Air Transport }} But the mapping check reports \"BAD URI: Illegal character in path at index 3: Air Transport\" 2. How did this work? But I can't figure out where this comes from. - The America West Airlines page does use the Infobox \"airline\", so it should use the above mapping - The mapping wiki doesn't have the word \"Aviation\" at all: 3. How to specify a URL to Canonical dbpedia? For some bg mappings, we want to specify a ConstantMapping dbo:gender=dbp:Male. We don't want the value to be in the bg.dbpedia.org namespace. Cheers! uHi Vladimir, Thanks for reporting this. The \"BAD URI\" was a bug that should be fixed with [1]. (See inline) On Fri, Dec 5, 2014 at 5:34 PM, Vladimir Alexiev < > wrote: This is the correct way, after me merge & deploy the pull request it should work as described again. Looks strange, I guess a mapping / wikipage revision probably had that information and ended in the 2014 release Any absolute URI will work ( DBpedia URIs when we find only text Cheers. Dimitris [1] https://github.com/dbpedia/extraction-framework/pull/288 udbpedia.org/resource/Air_Transport redirects to dbpedia.org/resource/Aviation. So I guess ConstantMapping respects the redirects: nice !! But for clarity, could you edit the above mapping to use Aviation directly? Thanks! Could you please add this to the doc page I still don't have editor rights." "DBpedia Lookup service license" "uHi all, what is the license for the DBpedia lookup service code hosted on Github [1]? Cannot see any license file. Regards Andrea [1] Hi all, what is the license for the DBpedia lookup service code hosted on Github [1]? Cannot see any license file. Regards Andrea [1] lookup uOn Fri, Mar 22, 2013 at 11:40 AM, Andrea Di Menna < > wrote: It's Apache 2.0. I added a license file. 
Cheers, Max" "problems with Virtuoso Sesame Provider" "uHello everybody, We have a problem with Virtuoso Sesame Provider. We get a NullPointerException, when we try to read the result (result.hasnext()), but only if we have a OPTIONAL in the query: SELECT distinct ?c ?l WHERE {{?c rdf:type owl:Class}. OPTIONAL {?c rdfs:label ?l}. FILTER (isURI(?c)) } uHi Nico, You should pose this question on the Virtuoso Users mailing list rather than the DBpedia mailing, thus please subscribe the the Virtuoso mailing lists as detailed at: What Virtuoso release are you running when encountering this issue ? We also have a Jena Provider which is Java based, but would like to determine if their is an issue with the Sesame provider and resolve it, so if you can provide steps to reproduce that would be useful Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 20 Apr 2010, at 15:27, N.Stieler wrote:" "Non English Articles (with no English equivalent)" "uHi all, Hopefully a simple question N Free games, great prizes - get gaming at Gamesbox. { margin:0px; padding:0px } body.hmmessage { FONT-SIZE: 10pt; FONT-FAMILY:Tahoma } Hi all, Hopefully a simple questionFrom what I can tell if an non English article does not have an English equivalent then this article does not appear in DBPedia, can any one inform me if this is correct for version 3.0? N She said what? About who? Shameful celebrity quotes on Search Star!" "Improper or inconsistent data after extraction" "uHi All, We are getting improper and inconsistent data for many records.Some issue which i have faced till yet are 1- Some person place of birth is not proper. It is having an integer value. 2- Sometimes we are getting birth date data pointed from \"placeOfBirth\" and sometimes from \"birthPlace\". 3- Sometimes birthDate is not proper. \" < http://www.w3.org/2001/XMLSchema#int> . < http://dbpedia.org/property/placeOfBirth> \"1\"^^< http://www.w3.org/2001/XMLSchema#int> . < http://dbpedia.org/property/placeOfBirth> \"1\"^^< http://www.w3.org/2001/XMLSchema#int> . < http://dbpedia.org/property/placeOfBirth> *\"1\"^^*< http://www.w3.org/2001/XMLSchema#int> < http://dbpedia.org/property/birthPlace> *\"1\"^^*< http://www.w3.org/2001/XMLSchema#integer> . < http://dbpedia.org/property/birthPlace> \"2\"^^< http://www.w3.org/2001/XMLSchema#integer> . *\"2\"^^*< http://www.w3.org/2001/XMLSchema#integer> . < http://dbpedia.org/property/birthPlace> \"2\"^^< http://www.w3.org/2001/XMLSchema#integer> . < http://dbpedia.org/property/birthPlace> \"3\"^^< http://www.w3.org/2001/XMLSchema#int> . < http://dbpedia.org/property/birthPlace> \"3\"^^< http://www.w3.org/2001/XMLSchema#int> . \" How to deal with such inconsistencies of the data. uProperties in the of Wikipedia template properties, which are not very consistent and clean. Properties in the See Cheers, JC On 9 April 2013 11:46, gaurav pant < > wrote: uHi Gaurav, On 04/09/2013 11:46 AM, gaurav pant wrote: All of these triples are extracted by the \"InfoboxExtractor\". The \"MappingExtractor\" extracts much better and more accurate data as discussed in that thread [1]. uSorry I forgot the link On 04/09/2013 02:56 PM, Mohamed Morsey wrote:" "Your subscription request" "uHello, Someone has requested to \"opt in\" to our mailing list(s) Addresses using this email address: To prevent abuse of the system, we are sending this confirmation message to you to let you decide if this is what you want. 
If you wish to subscribe, please click on the following link (or copy and paste it to your browser): If you do not wish to subscribe, just ignore this message and no action will be taken."
"infobox mappings case sensitive?" "uDear DBpedia team, Could you please clarify whether/why Wikipedia infobox names mappings are case sensitive? To transclude a template into an article or page, type {{template name}} in the wikitext at the place where the template is to appear. The first letter may be indifferently lower- or upper-case. Yet, dbpedia:Lufthansa gets rdf:type dbpedia-owl:Airline while dbpedia:Wardair does not. Many thanks in advance, Pavel uDear Pavel, yes, template mappings are case sensitive, except from the first letter of the template Best, Dimitris On Mon, Jul 6, 2015 at 5:02 PM, Pavel Smrz < > wrote:" "Virtuoso Instance and LOD Cloud Hosting Update" "uAll, At the current time here is what we have: 1. DBpedia 2. NeuroCommons 3. Bio2Rdf 4. Musicbrainz 5. PingTheSemanticWeb (the root for many Semantic Web search engines e.g. Falcons and Sindice) 6. Uniprot 7. Many more to come as part of our effort to put the entire LOD cloud into a Virtuoso 6.0 Cluster Edition. Links: 1. inside Virtuoso) 2. Cache is pretty cold, so a few query hits are required for general warm up. Also, we haven't set the 2-3 second response time ceiling, so any queries presented right now are executed without the \"anytime query\" feature. We've decided not to impose the server side ceiling until we've update the UI. Once done, users (UI or Web Service API) can then configure their response time factors within a floor and ceiling range set on the server. Thus, the moment you see the options in the UI you can safely assume the response time floor and ceiling settings are in place on the server. For those interested in URI lookups, as per my last mail about this matter, you can search on: Telemann, then use Types to filter out the types of entities associated with the pattern and then use properties to get you to your desired entity/object/resource which will naturally expose a URI / Entity ID. You can also search on \"Napoleon\" and ultimately see a map (note. the navigation section drop-down) that will plot locations of conflicts associated with the entity \"Napoleon\"." "DBpedia Outgoing Links submission" "uHi all, we are working on reactivating the DBpedia Links repo, which was a repository to submit outgoing links from DBpedia. The repo is moved here now: We are still working on a good way to submit links, but I guess there is no perfect solution uHi Sebestian, the git repo works for me: I have just made my first pull request ;) best, diego 2015-05-07 17:07 GMT+02:00 Sebastian Hellmann < >:" "problem with file wikipedia_links_ru.nt.bz2" "uDear all, I've got \"Unexpected end of archive\" when extracting wikipedia_links_ru.nt.bz2. Could someone fix original file on 3.51 DBPedia download page, please? Best, Vladimir Ivanov uHello Vladimir, Vladimir Ivanov wrote: Thanks for reporting this. The file should be fixed now. Kind regards, Jens uThanks a lot, Jens!
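Referring back to the infobox case-sensitivity question above: a quick way to compare the Lufthansa/Wardair difference is to list the types the extraction actually assigned to both resources. This is only a diagnostic sketch against the public endpoint; the two URIs are the ones named in the question, everything else is standard DBpedia vocabulary.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Compare the rdf:type statements of the two airlines; if dbpedia-owl:Airline is
# missing on one of them, its infobox name probably did not match the mapping.
SELECT ?airline ?type
WHERE {
  VALUES ?airline { <http://dbpedia.org/resource/Lufthansa>
                    <http://dbpedia.org/resource/Wardair> }
  ?airline rdf:type ?type .
}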
I've downloaded the \"nq\" version of the dataset and it was OK." "Sparql query for getting parent YAGO categories of more than 1 level" "uHi folks, I want to extract parent yago categories up to 4 levels . For eg : yago category as ' Similarly this parent category has ' Using PREFIX yago: PREFIX rdfs: SELECT ?title WHERE { rdfs:subClassOf ?title. } I get 1 level parent . How to get parents of more than 1 level i.e. rdfs:subClassOf transitivity Please help uHi Somesh, maybe this is helpful for you (in particular the last example in the answer): Best, Heiko Am 22.01.2013 14:51, schrieb Somesh Jain: uHi Somesh On 01/22/2013 02:51 PM, Somesh Jain wrote: This thread discusses what you want to achieve [1]. Hope it helps." "Japanese DBpedia SPARQL endpoint" "uI have been trying to use the Japanese DBpedia SPARQL endpoint ( most of the time and when up, it times out very quickly. Is there any alternative to query the Japanese DBpedia? another SPARQL endpoint of some sort? Regards uAs an example the following query times out. select ?s group_concat( ?label;separator='|'){ values ?sType { dbpedia-owl:Song dbpedia-owl:Single } . ?s a ?sType ; rdfs:label ?label } limit 50 On Thu, Aug 28, 2014 at 9:03 AM, Hamid Ghofrani < > wrote: uHi Hamid, The public endpoints are limited in resources and execution time, and more costly queries will time out. The alternative would be to set up your own endpoint following tutorial [1], with the difference that you would only import the Japanese dataset [2] . Cheers, Alexandru [1] [2] On Thu, Aug 28, 2014 at 3:07 PM, Hamid Ghofrani < > wrote: uHi Hamid, I restarted the Virtuoso instance of DBpedia Japanese yesterday, so it works better now. As Alexandru said, we recommend you to configure your own endpoint with the Japanese datasets if you use them heavily. Best, Fumi On Thu, Aug 28, 2014 at 10:03 PM, Hamid Ghofrani < > wrote:" "Extracting German noun forms" "uregarding the forms: currently that is not part of the dataset yet. And unfortunately it's not very easy to add it. I think it even would require some enhancement to the extractor (not just the config). But it's on my todo list. However such \"boxes\" of word forms are probably easier to extract with the default DBpedia infobox extractor. Maybe the DBpedia community could help with that. The biggest problem there would be to determine the right \"context\" (i.e. the subject URI). I crossposted this to DBpedia, so they can reply Regards, Jonas Am Freitag, den 01.06.2012, 12:08 +0200 schrieb Lars Aronsson: uSome thoughts On Fri, Jun 1, 2012 at 5:29 PM, Jonas Brekle < > wrote: I think you don't really need to enhance your extractor. Just run the DBpedia MappingExtractor in addition. You could do the following: - set up a mappings wiki for DBpedia Wiktionary (1) - add a mapping for {{Deutsch Substantiv Übersicht}} to the mappings wiki - during the DBpedia Wiktionary extraction, also run a MappingExtractor instance that uses the mappings from your mappings wiki (1) or add namespaces to the existing mappings wiki - although it's getting a bit crowded as far as namespaces are concerned :-) As far as I can tell, DBpedia Wiktionary currently only has subject URIs for words from en.wiktionary.org, right? So you'd probably have to add URIs like I don't know if properties like \"Nominativ Singular=das Haus\" should be extracted as URIs or as literals. Just my 0.02€. I don't know much about DBpedia Wiktionary.
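On the YAGO question above: the replies only point at other threads, so here is a minimal sketch using a SPARQL 1.1 property path to follow rdfs:subClassOf over any number of levels. The concrete class URI from the original mail was lost in this archive, so yago:Actor109765278 is just a hypothetical stand-in; the endpoint must support SPARQL 1.1 for the + operator to work.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

# rdfs:subClassOf+ walks one or more subclass hops, so all ancestors come back
# in a single query instead of one query per level.
SELECT DISTINCT ?super
WHERE {
  yago:Actor109765278 rdfs:subClassOf+ ?super .
}
LIMIT 100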
Of course, we could somehow work this into the main DBpedia extraction, but I think the solution outlined above makes more sense. Cheers, JC" "Mappings (hr)" "uHi, I need some information about the Croatian DBpedia. I did some template mappings and now I am interested in the following procedure. I would like to know what I have to do after the template mappings? Thank you in advance, Ivana Hi, I need some information about the Croatian DBpedia. I did some template mappings and now I am interested in the following procedure. I would like to know what I have to do after the template mappings? Thank you in advance, Ivana uHi Ivana, now that there are mappings for Croatian, you can run the DBpedia extraction. Several things have to be installed and configured, which is documented here: Section 1 describes what has to be installed to run the DBpedia extraction framework. In 4.1., all things that must be specified before starting the extraction from a dump file are listed. In the file \"dump/config.properties\" (using the file \"dump/config.properties.default\" as a template), you can specify languages=hr extractors.hr=org.dbpedia.extraction.mappings.MappingExtractor When you run the extraction (see 4.2.), the MappingExtractor will extract the information from the infoboxes that you created a mapping for. The extracted triples will be saved in a file named \"mappingbased_properties_hr.nt\" in the output directory you specified. Cheers, max On Fri, Jul 9, 2010 at 11:29 PM, Ivana Sarić < > wrote:" "dbpedia URI not consistent with Wikipedia URI" "uHi, It surprised me that a dbpedia URI is not consistent with its corresponding Wikipedia URI. This is URI in dbpedia is need resolve this issue because i found it break link of data. For example, from its dbpedia-owl:architect is dbpedia:John_Augustus_Roebling. However, when I query the rdf:type of dbpedia:John_Augustus_Roebling using SPARQL endpoint, it gave me no result. The reason is that there is no dbpedia:John_Augustus_Roebling but instead dbpedia:John_A._Roebling. I don't know how many else such URIs exist. Best regards, Lushan Han uHi, The wikipedia article about John_Augustus_Roebling (1) redirects to John_A._Roebling (2) that is why you cannot find any information for (1) the Brooklyn Bride article has a link on the redirection article Although this is not an a bug, it could be resolved in the extraction framework and replace all redirections to the proper articles. A shell script could do the job, any ideas / comments? Cheers, Dimitris On Tue, Apr 12, 2011 at 11:22 PM, Lushan Han < > wrote: uHi Dimitris, I am afraid that you did not completely see my point. It is not simply a redirection problem. For example, if I want to make a SPARQL query uMaybe what Dimitris says is that this query would indeed be answered if: - redirects were treated as sameAs and inference was used (works for this but not all cases) - the framework used redirects to do identity resolution at extraction time Also, i should point out that you can probably sort this problem out with a simple Silk link spec. Cheers Pablo On Apr 13, 2011 3:12 PM, \"Lushan Han\" < > wrote: uI like the second approach uYep. The disadvantage is that it is intrusive (requires access to DBpedia extraction). Luckily, DBpedia is an open source project to which any of us can contribute. Better yet, you can adapt similar code from DBpedia Spotlight into a DBpedia extractor and contribute it to the project. 
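Before the extraction-side fixes discussed next, the redirect mismatch from the thread above can also be worked around on the client side by following the redirect property in the query itself. This is a sketch, assuming the redirects dataset is loaded in the endpoint and SPARQL 1.1 property paths are available; dbpedia-owl: is the usual http://dbpedia.org/ontology/ namespace.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

# Hop over zero or more redirects before asking for the type, so the query also
# works when the architect link points at John_Augustus_Roebling instead of
# John_A._Roebling.
SELECT ?target ?type
WHERE {
  dbpedia:John_Augustus_Roebling dbpedia-owl:wikiPageRedirects* ?target .
  ?target rdf:type ?type .
}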
It should be in: org.dbpedia.spotlight.util.SurrogatesUtil.scala ( ) I will make sure to bug the leader of the next release to include it. :) Cheers, Pablo On Fri, Apr 15, 2011 at 2:43 PM, Lushan Han < > wrote: uA new extractor will be too expensive i think a script can do the job just fine it will have the redirects.nt as a look-up table and replace all occurrences in the extraction dumps cheers, Dimitris On Fri, Apr 15, 2011 at 4:10 PM, Pablo Mendes < > wrote: uI was thinking it could slowly evolve to a sort of DBpediaResourceFactory class at the core of the workflow who knew everything about transforming Wikipedia Page URLs into DBpedia Resource URIs/IRIs (including language-specific knowledge, redirects, etc.) But, yes, sure. Your solution sounds simple and efficient. :) Keep in mind that redirects.nt may need some treatment to compute the transitive closure (A redirects_to B redirects_to C -> A redirects_to C). Cheers, Pablo On Fri, Apr 15, 2011 at 3:45 PM, Dimitris Kontokostas < >wrote: uthe DBpediaResourceFactory seems better but redirects can be very big (only the English one is ~750MB). I don't know how the framework can handle such big data You're right about the transitive issue, I haven't thought of it :) you just found a bug in a new script i am creating :) anyway, this can be worked out (somehow I guess) Cheers, Dimitris On Fri, Apr 15, 2011 at 5:57 PM, Pablo Mendes < > wrote: uJust committed a script that resolves redirects for URIs in object position in .nt and .nq files. Cheers, Max On Fri, Apr 15, 2011 at 17:23, Dimitris Kontokostas < > wrote:" "Draft paper submission deadline is extended: EISWT-10, Orlando, USA" "uIt would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in enterprise information systems, information technology, e-commerce, web-based systems, data-mining and related areas. Draft paper submission deadline is extended: EISWT-10, Orlando, USA The 2010 International Conference on Enterprise Information Systems and Web Technologies (EISWT-10) (website: areas of Enterprise Information Systems, Enterprise Solution Systems, Databases as well as Web Technologies. The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields.The following conferences are planned to be organized as part of MULTICONF-10. 
* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10) * International Conference on Automation, Robotics and Control Systems (ARCS-10) * International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) * International Conference on Computer Communications and Networks (CCN-10) * International Conference on Enterprise Information Systems and Web Technologies (EISWT-10) * International Conference on High Performance Computing Systems (HPCS-10) * International Conference on Information Security and Privacy (ISP-10) * International Conference on Image and Video Processing and Computer Vision (IVPCV-10) * International Conference on Software Engineering Theory and Practice (SETP-10) * International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10) MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World® Resort, Universal Studios and Sea World Orlando. Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining — all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando. We invite draft paper submissions. Please see the website more details. Sincerely John Edward" "Question about run dbpedia Extraction-Framework in windows 7" "uDear All, I would like to ask how can I install or run dbpedia Extraction-Framework in windows 7. Actually I tried to install it by using Ubuntu, but unfortunately it does not work with (i.e. there are some steps that do not work with me, I do not know why? and I'm not expertise with Linux). Actually I prefer to work on windows environment, so is it possible to run dbpedia Extraction-Framework on windows? if its possible so could you kindly explain to me step by step how can run it please? By the way I would like to inform the owner of Dbpedia there is a link does not open I found it while I was trying to run dbpedia on Ubuntu , the link is : Please make sure the patch Many thanks for your help. With Regards Abdullah Dear All, I would like to ask how can I install or run dbpedia Extraction-Framework in windows 7. Actually I tried to install it by using Ubuntu, but unfortunately it does not work with (i.e. there are some steps that do not work with me, I do not know why? and I'm not expertise with Linux). Actually I prefer to work on windows environment, so is it possible to run dbpedia Extraction-Framework on windows? if its possible so could you kindly explain to me step by step how can run it please? By the way I would like to inform the owner of Dbpedia there is a link does not open I found it while I was trying to run dbpedia on Ubuntu , the link is : Please make sure the patch Abdullah" "property/batAvg vs property/battingAverage" "uHi, I'm kinda new to Dbpedia and semantic web as such. I'm trying to get the predicate of my query, so I'm trying to do a string match from the natural language query that I have. Eg: Q: \"What is the batting average of Sachin Tendulkar\" I use this query to find the predicates : SELECT ?predicate ?label WHERE {{ ?predicate ?label . ?predicate ?propertyType. 
?label bif:contains '\"batting average\"'} filter ( ?propertyType = < ?propertyType = < ?propertyType = < } limit 30 I got this, when I ran the query: Predicate: label: \"batting average\"@en The result is agreeably, but when I see the page of I see that he has a property called not And the same goes with the http://dbpedia.org/property/odishirtNo. I'm pretty sure I'm doing a mistake somewhere, but not sure where :| So my question is : - Is this the only way to match a string with a property? ( the query way ) - Why do we have 2 properties for the same purpose ? Thanks! P.S: Please do let me know if this question is not appropriate for THIS mailing list. I'm not sure if I can ask questions about finding the predicate and all. uHi Prashanth, On 03/20/2013 06:11 PM, Prashanth Swaminathan wrote: Resource \"dbpedia:Sachin_Tendulkar\" uses property \"dbpprop:batAvg\" which has label \"bat avg\" not \"batting average\". So, you should simply use \"bat avg\" in your query as follows: SELECT ?predicate ?label WHERE {{ ?predicate rdfs:label ?label . ?predicate rdf:type ?propertyType. ?label bif:contains '\"bat avg\"'} FILTER ( ?propertyType = owl:DatatypeProperty || ?propertyType = owl:ObjectProperty || ?propertyType = rdf:Property ) } LIMIT 30 With regard to the properties, that thread might clarify the issue [1]. So, in order to standardize the usage of properties we managed to build our mappings wiki [2]." "Checking out the DBpedia Extraction Framework" "uHi all I tried to download the extraction framework following this tutorial but got an error when trying to in Eclipse: \"File\" >> \"New\" >> \"Project\" >> Mercurial >> \"Clone Existing Mercurial Repository\" with   is this still supposed to be working? Sorry if I missed something obvious. Regards, Felix uHi Felix, What happens from command line? hg clone Also, I'd personally suggest going with IntelliJ IDEA since the scala plugin there seems to work better. I think Jona uses Eclipse, though. Did you try the trick \"Right-click on the project and choose: Maven >> “Enable Dependency Managment”\"? Cheers, Pablo On Wed, Jun 6, 2012 at 1:31 PM, < > wrote: uHm, depends, I have two machines behind different proxies, one says: $ hg clone Abbruch: Fehler: And the other: :~/tmp$ hg clone Abbruch: HTTP Error 403: Forbidden So perhaps it's a proxy thing? I should perhaps try to tell hg the proxy I'm really used to eclipse and would like to use a different IDE. Well, I can't try it cause there is no project yet. Thanks, Felix Von: Pablo Mendes [mailto: ] Gesendet: Mittwoch, 6. Juni 2012 14:47 An: Burkhardt, Felix Cc: Betreff: Re: [Dbpedia-discussion] Checking out the DBpedia Extraction Framework Hi Felix, What happens from command line? hg clone Also, I'd personally suggest going with IntelliJ IDEA since the scala plugin there seems to work better. I think Jona uses Eclipse, though. Did you try the trick \"Right-click on the project and choose: Maven >> \"Enable Dependency Managment\"\"? Cheers, Pablo On Wed, Jun 6, 2012 at 1:31 PM, < > wrote: Hi all I tried to download the extraction framework following this tutorial but got an error when trying to in Eclipse: \"File\" >> \"New\" >> \"Project\" >> Mercurial >> \"Clone Existing Mercurial Repository\" with   is this still supposed to be working? Sorry if I missed something obvious. Regards, Felix uI can't reproduce. Sounds like an error on your end. Perhaps this helps? Cheers, Pablo On Wed, Jun 6, 2012 at 2:59 PM, < > wrote: uSorry. I'm not a mercurial expert, so I'm afraid I can't help more here. 
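Coming back to the batAvg/battingAverage question earlier in this section: since the raw infobox extraction keeps whatever parameter name the article used, a query can simply try both property names at once. A sketch using a SPARQL 1.1 alternation path; which of the two properties a given player actually carries depends on the article's template.

PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

# Accept either spelling of the raw infobox property in one pattern.
SELECT ?avg
WHERE {
  dbpedia:Sachin_Tendulkar dbpprop:batAvg|dbpprop:battingAverage ?avg .
}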
If I may use this opportunity to introduce myself: Yes. I remember you from LREC. Welcome to the list! :) we have projects that require NER and ontological classification and I have Perhaps you want to take a look at DBpedia Spotlight? :) Cheers, Pablo On Wed, Jun 6, 2012 at 4:53 PM, < > wrote: uI tried to set the proxy for hg but doesn't work. Tried both machines (Win and Linux) and command line like this hg" "Bad request exception when getting urispaces of DBpedia" "uHi, I want to see the different resources that have an urispace different from \" My query is SELECT ?s WHERE { ?s ?p ?o FILTER NOT EXISTS { ?s ?p ?o FILTER regex(str(?s), \" } } LIMIT 1 and i execute this query like this : String query = setQuery(firstUriSpace); QueryExecution execution = QueryExecutionFactory.sparqlService(endpoint, query); try { ResultSet set = execution.execSelect(); while (set.hasNext()) { Resource newResource = set.next().getResource(\"s\"); } } catch (Exception e) { e.printStackTrace(); execution.close(); } execution.close(); And i took this exception : HttpException: HttpException: 400 Bad Request: HttpException: 400 Bad Request at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execCommon(HttpQuery.java:350) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:189) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:144) at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:141) at main.VOIDExtractor.extractBasicPropertiesManuallyForBigDataset(VOIDExtractor.java:819) at main.VOIDExtractorTest.extractUrispacesOfVeryBigDatasetManuallyTest(VOIDExtractorTest.java:484) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50) at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197) Caused by: 
HttpException: 400 Bad Request at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execCommon(HttpQuery.java:299) 29 more This query works on web site \" jena it didn't work. How can i execute from jena? Thank you. uHi Ziya, sorry of the belated reply. On 05/15/2012 03:24 PM, Ziya Akar wrote: The following code fragment works for me: String query = \"SELECT ?s WHERE { ?s ?p ?o. FILTER NOT EXISTS {?s ?p ?o\n\" + \" FILTER regex(str(?s), \\" \" }\n\" + \" } LIMIT 1\"; String endPoint = \" QueryEngineHTTP queryEngine = new QueryEngineHTTP(endPoint, query); try { ResultSet set = queryEngine.execSelect(); while (set.hasNext()) { Resource newResource = set.next().getResource(\"s\"); System.out.println(newResource); } } catch (Exception e) { e.printStackTrace(); queryEngine.close(); } queryEngine.close();" "How long is DBpedia going to be down ???" "u uHi Somesh Jain, The site is already back online. Patrick0€ *†H†÷  €0€1 0 +" "Help with my first dbpedia select - Simpsons Episodes almost working" "uHi all, I've tried and tried but I cannot understand how to get my select to do what I want. here is what I have and it works fine PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: PREFIX dbprop: SELECT * WHERE { ?episode < ?episode dbpedia2:season ?season . ?episode dbpedia2:airdate ?created . ?episode dbpedia2:blackboard ?blackboard .FILTER ( lang(?comment) = \"en\" ). ?episode dbpedia2:episodeName ?episodeName . ?episode dbpedia2:episodeNo ?episodeNo . ?episode rdfs:comment ?comment . FILTER ( lang(?comment) = \"en\" ) } All I want to do is change it so that it returns all seasons, not just season 10. I know there is a listing of series here - So if someone could please take pity on a newby and let me know how to change the select? uHi Teddy, you can try with the following PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: PREFIX dbprop: SELECT * WHERE { ?subject skos:broader . ?episode ?subject . ?episode dbpedia2:season ?season . ?episode dbpedia2:airdate ?created . ?episode dbpedia2:blackboard ?blackboard .FILTER ( lang(?comment) = \"en\" ). ?episode dbpedia2:episodeName ?episodeName . ?episode dbpedia2:episodeNo ?episodeNo . ?episode rdfs:comment ?comment . FILTER ( lang(?comment) = \"en\" ) } Is that what you need? Regards Andrea 2013/1/8 Andy 'Ted' Tedford < >: uHi Teddy, On 01/08/2013 01:10 PM, Andy 'Ted' Tedford wrote: try the following one: SELECT * WHERE { ?episode ?seasonOfChoice. ?seasonOfChoice skos:broader . ?episode dbpedia2:season ?season . ?episode dbpedia2:airdate ?created . ?episode dbpedia2:blackboard ?blackboard .FILTER ( lang(?comment) = \"en\" ). ?episode dbpedia2:episodeName ?episodeName . ?episode dbpedia2:episodeNo ?episodeNo . ?episode rdfs:comment ?comment . FILTER ( lang(?comment) = \"en\" ) } uTry this one: PREFIX dbprop: PREFIX dcterms: SELECT DISTINCT * WHERE { ?episode dcterms:subject ?subject ; dbprop:season ?season ; dbprop:airdate ?created ; dbprop:blackboard ?blackboard ; dbprop:episodeName ?episodeName ; dbprop:episodeNo ?episodeNo ; rdfs:comment ?comment . FILTER ( langMatches( lang( ?comment ), \"EN\" ) ) . FILTER regex( ?subject, \"^ } On Tue, Jan 8, 2013 at 2:10 PM, Andy 'Ted' Tedford < > wrote:" "Is it possible to find if the article corresponds to a location ??" "uHi people, Suppose I have an article name. Is it possible for me to find if that article is a location or not. 
When I say location , I am including all the continents, countries, counties, streets etc , anything with some latitude, longitude associated with it. Help me uHi Somesh, On 10/25/2012 12:35 PM, Somesh Jain wrote: does that query do what you want, assuming your test article is \"Paris\" : ASK WHERE { dbpedia:Paris a dbpedia-owl:Place.}" "Mapping-based Properties (Cleaned)" "uHello DBpedia developers, One of the files in the 3.9 release is described as: Mapping-based Properties (Cleaned) This file contains the statements from the Mapping-based Properties, with incorrect statements identified by heuristic inference being removed. Is there documentation or code describing/implementing how this cleaning is done? Also is there some evaluation for how effective it is? I am working on cleaning and extending extracted relations as well and I would like to know the state of the art. Thanks, Michael Hello DBpedia developers, One of the files in the 3.9 release is described as: Mapping-based Properties (Cleaned) This file contains the statements from the Mapping-based Properties, with incorrect statements identified by heuristic inference being removed. Is there documentation or code describing/implementing how this cleaning is done? Also is there some evaluation for how effective it is? I am working on cleaning and extending extracted relations as well and I would like to know the state of the art. Thanks, Michael" "Announcing Virtuoso Open-Source Edition v 6.1.0" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.1.0: IMPORTANT NOTEfor up-graders from pre-6.x versions: The database file format has substantially changed between VOS 5.x and VOS 6.x. To upgrade your database, you must dump all data from the VOS 5.x database and re-load it into VOS 6.x. Complete instructions may be found here. IMPORTANT NOTEfor up-graders from earlier 6.x versions: The database file format has not changed, but the introduction of a newer RDF index requires you run a script to upgrade the RDF_QUAD table. Since this can be a lengthy task and take extra disk space (up to twice the space used by the original RDF_QUAD table may be required during conversion) this is not done automatically on startup. Complete instructions may be found here. New and updated product features include: * Database engine - Added new 2+3 index scheme for RDF_QUAD table - Added new inlined string table for RDF_QUAD - Added optimizations to cost based optimizer - Added RoundRobin connection support - Removed deprecated samples/demos - Fixed align buffer to sizeof pointer to avoid crash on strict checking platforms like sparc - Fixed text of version mismatch messages - Fixed issue with XA exception, double rollback, transact timeout - Merged enhancements and fixes from V5 branch * SPARQL and RDF - Added support for owl:inverseOf, owl:SymmetricProperty, and owl:TransitiveProperty. - Added DB.DBA.BEST_LANGMATCH() and bif_langmatches_pct_http() - Added initial support for SPARQL-FED - Added initial support for SERVICE { }; - Added support for expressions in LIMIT and OFFSET clauses - Added built-in predicate IsRef() - Added new error reporting for unsupported syntax - Added rdf box id only serialization; stays compatible with 5/6 - Added support for SPARQL INSERT DATA / DELETE DATA - Added SPARQL 1.1 syntax sugar re. 
HAVING CLAUSE for filtering on GROUP BY - Added special code generator for optimized handling of: SPARQL SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } } - Added support for HTML+RDFa representation re. output from SPARQL CONSTRUCT and DESCRIBE queries - Added support for output:maxrows - Improved SPARQL parsing and SQL codegen for negative numbers - Improved recovery of lists in DB.DBA.RDF_AUDIT_METADATA() - Fixed iSPARQL compatibility with 3rd party SPARQL endpoints - Fixed bad init in trans node if multiple inputs or step output values - Fixed redundant trailing '>' in results of TTL load when IRIs contain special chars - Fixed problem with rfc1808_expand_uri not using proper macros and allocate byte extra for strings - Fixed when different TZ is used, find offset and transform via GMT - Fixed graph-level security in cluster - Fixed redundant equalities in case of multiple OPTIONALs with same variable - Fixed BOOLEAN_OF_OBJ in case of incomplete boxes - Fixed NTRIPLES serialization of triples - Merged enhancements and fixes from V5 branch * Sponger Middleware - Added Extractor Cartridges mapping Zillow, O'Reilly, Amazon, Googlebase, BestBuy, CNET, and Crunchbase content to the GoodRelations Ontology. - Added Extractor Cartridges for Google Spreadsheet, Google Documents, Microsoft Office Docs (Excel, PowerPoint etc), OpenOffice, CSV, Text files, Disqus, Twitter, and Discogs. - Added Meta Cartridges covering Google Search, Yahoo! Boss, Bing, Sindice, Yelp, NYT, NPR, AlchemyAPI, Zemanta, OpenCalais, UMBEL, GetGlue, Geonames, DBpedia, Linked Open Data Cloud, BBC Linked Data Space, sameAs.org, whoisi, uclassify, RapLeaf, Journalisted, Dapper, Revyu, Zillow, BestBuy, Amazon, eBay, CNET, Discogs, and Crunchbase. * ODS Applications - Added support for ckeditor - Added new popup calendar based on OAT - Added REST and Virtuoso PL based Controllers for user API - Added new API functions - Added FOAF+SSL groups - Added feed admin rights - Added Facebook registration and login - Removed deprecated rte and kupu editors - Removed support for IE 5 and 6 compatibility - Merged enhancements and fixes from V5 branch Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: OpenLink uNathan wrote: We just have new Cartridges (Extractor and Meta). The Extractors are Open Source ( versions 5 & 6) while the Meta Cartridges are commercial only. Yes, re. SPARQL-GEO. Thus, the fundamental differentiators between the Open Source and Commercial Editions come down to: 1. Sponger's Meta Cartridges" "Announcement: DBpedia 3.1 Release" "uHello, hereby we announce the 3.1 release of DBpedia. As always, downloads are available at [1] and the list of changes since DBpedia 3.0 is in our changelog [2]. Some notable improvements are a much better YAGO mapping, providing a more complete (more classes assigned to instances) and accurate (95% accuracy) class hierarchy for DBpedia. The Geo extractor code has been improved and is now run for all 14 languages. URI validation has switched to the PEAR validation class. Overall, we now provide 6.0 GB (58,4 GB uncompressed) of downloadable nt and csv files. The triple count (excluding pagelinks) has surpassed the 100 million barrier and is now at 116.7 million triples, which is an increase of 27% compared to DBpedia 3.0. 
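Relating to the owl:inverseOf / owl:TransitiveProperty support listed in the Virtuoso 6.1.0 notes above: Virtuoso can apply such ontology semantics at query time when a query names an inference rule set. The rule set name below is purely an assumption for illustration; it would have to be registered on the server first (typically with rdfs_rule_set over the ontology graph), so treat this as a sketch rather than a ready-made recipe.

DEFINE input:inference "dbpedia-ontology"
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

# With the (hypothetical) rule set attached, subclass reasoning happens at query
# time, so instances typed only as City, Country etc. are also returned as Place.
SELECT ?place
WHERE {
  ?place a dbpedia-owl:Place .
}
LIMIT 10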
The extraction was performed on a server of the AKSW [3] research group. I would like to thank Sören Auer, Jörg Schüppel, Chris Bizer, Richard Cyganiak, Georgi Kobilarov, Christian Becker, the OpenLink team, and all other contributors for their DBpedia support. Kind regards, Jens Lehmann [1] [2] [3] http://aksw.org" "Linking to the Norwegian Company Registry" "uHi, in general, what is the process to get new extraction functionality added to dbpedia? Do I submit a feature request and wait and hope, or should I write code and propose it for inclusion in the extraction framework? More concretely: we are in the process of publishing lots of interesting information about Norwegian companies and organisations, taken from the official national registry ( linked data. This will be RDF data based on daily dumps of the registry. We will obviously try to link our data to dbpedia, but we think it would be cool to have dbpedia link to up-to-date information from the company register. In many cases, this shouldn't be too hard: Norwegian organisations are identified by a unique 9 digit \"organisation number,\" which will be part of the URIs of organisations. These organisation numbers are stated on many Norwegian wikipedia pages, either in an infobox as \"Org. nummer 123 456 789\" (see e.g. part of an external reference to \"Nøkkelopplysninger fra Enhetsregisteret\" (see e.g. ref 3 on the same page). In the latter case, the link goes to an info page created by the company registry, based on an organisation number in the URI. In other words, creating e.g. owl:sameAs triples for companies where the Norwegian wikipedia page gives an org. number should be easy. For those pages which don't yet give the org. number, it would be an extra incentive to add such information. So back to the start: what would be necessary to make this work? Yours, Martin uOn Wed, Feb 15, 2012 at 9:39 AM, Martin Giese < > wrote: I think the semantics of an external reference are much too loose to do anything useful with. Using infobox data is more reliable, but you should still be cautious about cases where the infobox doesn't refer to the same entity as the page that hosts it does. To make up an example, where a person who's the managing director of a company has an infobox for the company on his page. It may sound unlikely, but Wikipedia is riddled with stuff like this. Failing to identify cases where this happens will cause bad assertions to be made which will ripple downstream to cause contradictory inferences. Tom uHi, and thanks for your reply, Tom! On 2012-02-15 20:16, Tom Morris wrote: In general, I agree. But it would be possible to double check using the organization's name. Usually, the name in the register will be the same as the title of the Norwegian Wikipedia page. Again, using the name as additional identification should help here. Norway has the additional advantage of being so small that only a few hundred companies, organizations, and institutions are present on Wikipedia. (At least with an org. nr. For the rest, we have little hope of linking them up) Therefore, a manual quality check seems entirely possible. Martin uThere is always the option of creating another extractor and sending us a pull request. For your case, though, it would be an extremelly specialized extractor, with little chances of being reused by others. If many people start creating those, they become hard to maintain. 
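For the Norwegian registry linking proposed above, the owl:sameAs generation could start as a plain CONSTRUCT over whatever infobox property carries the organisation number. Both the property name dbpprop:orgNummer and the registry URI pattern below are assumptions made up for illustration; the real template parameter and the registry's linked-data namespace would have to be checked first, and REPLACE only strips the spaces commonly written inside the nine-digit numbers.

PREFIX owl:     <http://www.w3.org/2002/07/owl#>
PREFIX dbpprop: <http://dbpedia.org/property/>

# dbpprop:orgNummer and the brreg.no URI pattern are illustrative assumptions.
CONSTRUCT {
  ?org owl:sameAs ?registryUri .
}
WHERE {
  ?org dbpprop:orgNummer ?nr .
  BIND ( IRI(CONCAT("http://data.brreg.no/enhetsregisteret/enhet/",
                    REPLACE(STR(?nr), " ", ""))) AS ?registryUri )
}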
I would say that the best solution would be to create mappings for all templates that you need uOn Thu, Feb 16, 2012 at 4:11 AM, Martin Giese < > wrote: Wikipedia categories tend to be pretty noisy (at least in English Wikipedia), but if you can get a reasonable set of candidates, the list sounds like it would be small enough that you could use Google Refine and the OpenCorporates reconciliation service for Refine to discover any missing registry numbers. Alternatively, you could use the DERI RDF extension and your triplified registry database to do the reconciliation directly against that (I think. I haven't done this personally). The RDF extension can also be used to generate RDF if you wanted to produce your owl:sameAs triples that way, although that's probably not a good workflow for anything other than a one-time effort. Tom" "Mapping editor rights" "uDBPedia community, I'd like to request editor rights for the mappings wiki. Initially, I plan on dealing with football player extraction and linking them to the football clubs for which they play. Bryan DBPedia community, I'd like to request editor rights for the mappings wiki. Initially, I plan on dealing with football player extraction and linking them to the football clubs for which they play. Bryan User:Bryan.burgers uHi Bryan, you got editor rights. Happy mapping! Anja On 13.01.2012 16:19, Bryan Burgers wrote:" "Labels and comments on mappings wiki" "uHi, it's great that there are mappings for more and more languages, but adding a mapping namespace is quite an elaborate task, partly because the ontology templates [1][2][3][4] use properties like \"rdfs: \", \"rdfs: \" etc. and we have to update all four templates when we add a new language. To make that easier, I added Template:Label [5] and Template:Comment [6] and added code to OntologyLoader.scala that parses them. Instead of \"rdfs: =foo\", please use {{label|en|foo}} in the future. Similar for comments. See the documentation of the ontology templates [1][2][3][4] for details. And please, whenever you edit an ontology item, change it to the new format for labels and comments. It only takes a minute. :-) When all old labels are comments are gone, we can remove the old code. Cheers, JC [1] [2] [3] [4] [5] [6] Template:Comment" "A request for the creation of the Arabic Chapter of DBpedia" "uDear Dimitris, I hope that you are in good health. I am Dr.Haytham Al-Feel the one who participated in the mapping of most of the Arabic infoboxes in the Arabic chapter of DBpedia in the past from 2012. Actually I think that it is the time for the creation of the Arabic Chapter of DBpedia to be added hopefully to other multilingual chapters, so we need your support and the internationalization Committee for that.Best Regards,Haytham uHello Haytham & welcome It will be great to have an Arabic chapter! I see you already have localized configurations for all extractors and some mappings defined. The mapping stats are not great [1] but I hope that building a community around your chapter will help you get more coverage. At the moment chapters use their own resources to provide a sparql endpoint, linked data interface and additional dumps e.g. it.dbpedia.org, de.dbpedia.oprg, nl.dbpedia.org etc Once we check that you have everything set-up we will set the ar.dbpedia.org subdomain to your server (This workflow might change in the future but at the moment this is how we do this) You can always get help from us through the the discussion list , or the dev list for technical matters. 
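Since the label and comment templates above exist precisely to make per-language labels easier to add, a query like the following can show where a new chapter (Arabic in the request above, but any language code works) still has gaps in the ontology. A sketch against the public endpoint; it assumes the ontology classes are loaded there together with their rdfs:label translations.

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Ontology classes that still have no Arabic label; swap "ar" for another code.
SELECT ?class
WHERE {
  ?class a owl:Class .
  FILTER NOT EXISTS {
    ?class rdfs:label ?label .
    FILTER ( lang(?label) = "ar" )
  }
}
LIMIT 100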
We also need tp track contacts details for your chapter but we recently switched to a new website and the workflow for editing the website is not yet stable. Cheers, Dimitris [1] On Tue, May 12, 2015 at 11:56 PM, haytham < > wrote:" "Missing rdfs:label" "uI'm starting to do some queries with SPARQL. However, some property values for rdfs: label is missing some resources. When I run this query: select distinct ?resource WHERE {?resource rdfs:label ?label. ?label bif:contains '\"nickelback\"'.} LIMIT 50 Does not appear the main page that it should be DBpedia: This is because some labels have been removed. But this feature appeared previously. Could someone explain me this? Many thanks for any help, I'm starting to do some queries with SPARQL. However, some property values for rdfs: label is missing some resources. When I run this query: select distinct ?resource WHERE {?resource rdfs:label ?label. ?label bif:contains ''nickelback''.} LIMIT 50 Does not appear the main page that it should be DBpedia: < help," ".bz2 problem" "uHello guys, Today I was trying to use the extraction framework to extract data for the Arabic language. When it comes to finding the file in the download directory (dump file), it didn't work, so after a while I figured that a part of code from the file Import.scala is written as follow : try { for (language <- languages) { val finder = new Finder[File](baseDir, language, \"wiki\") val tagFile = if (requireComplete) Download.Complete else \"* pages-articles.xml*\" val date = finder.dates(tagFile).last val file = finder.file(date, \"*pages-articles.xml*\") I tried to change the name to *\"pages-articales.xml.bz2\"* and the extraction successfully passed this point. My point is, don't you think that we should make the changes I mentioned above ? Because when we download the dump file, it comes with *\".bz2\"* in the name. Best regards, Ahmed. uHi Ahmed, in the default configuration files you will find the following lines # default: # source=pages-articles.xml # alternatives: # source=pages-articles.xml.bz2 # source=pages-articles.xml.gz You should comment / uncomments the ones that suit you Best, Dimitris On Sun, Apr 21, 2013 at 2:24 AM, Ahmed Ktob < > wrote: uHi, hm, no, sorry, in this case that won't work. The Import class is not configurable enough. I think Import.scala can't handle zipped files at all, so changing the name won't help either. I'll have a look, maybe I can fix this quickly. Cheers, JC On 21 April 2013 18:00, Dimitris Kontokostas < > wrote: uHi, Dimitris is right. Ahmed was referring to Import.scala, but that's probably not what's causing the problem. Ahmed, please try to edit the config file as Dimitris said and the extraction should work. You only need Import.scala if you want to extract abstracts. Anyway, I just added some code to make Import.scala more flexible. I also added a new argument in dump/pom.xml: users can now specify the name of the XML dump file, and Import.scala will automatically unzip if the suffix is .gz or .bz2. If you encouter any problems, let us know. Cheers, JC On 21 April 2013 18:08, Jona Christopher Sahnwaldt < > wrote: uAhmed, if things still don't work for you, please tell us exactly what you are trying to do: which Maven launcher? How do you start it? Please attach a copy of the configuration files and Scala files that you edited and a text file containing the complete Maven output. 
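For the missing rdfs:label case reported above, a common workaround is to search a few label-like properties at once and to accept hits on redirect pages. This is only a sketch: bif:contains is Virtuoso-specific and only matches predicates that are full-text indexed on that endpoint, so results may still differ between DBpedia releases.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

# Try rdfs:label, foaf:name, and the labels of redirect pages pointing at the resource.
SELECT DISTINCT ?resource
WHERE {
  { ?resource rdfs:label ?l . ?l bif:contains "nickelback" . }
  UNION
  { ?resource foaf:name ?l . ?l bif:contains "nickelback" . }
  UNION
  { ?redirect rdfs:label ?l . ?l bif:contains "nickelback" .
    ?redirect dbpedia-owl:wikiPageRedirects ?resource . }
}
LIMIT 50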
Cheers, JC On 21 April 2013 19:17, Jona Christopher Sahnwaldt < > wrote: uWell, first I should mention that I am using Intellij IDEA within Windows 7, I can't try now on Linux because my works on Windows and I haven't enough free space )) Also I am following this tutorial [1] to accomplish the Abstract Extraction. I followed it until when it comes to importing data, it didn't work for me with the error : java.lang.IllegalArgumentException: found no directory C:\Users\AHMED\Desktop\arwiki/[YYYYMMDD] containing file arwiki-[YYYYMMDD]-pages-articles.xml So I started reading the Import.Scala code and I figured maybe if I changed the code : val tagFile = if (requireComplete) Download.Complete else * \"pages-articles.xml\"* val date = finder.dates(tagFile).last val file = finder.file(date, *\"pages-articles.xml\"*) to *\"pages-articles.xml.bz2\" *maybe it will work.I did it and it worked (I passed this step). After the answer of Dimitris, I redo my changes and uncomment the source as he mentioned in both *extraction.abstracts.properties* & * extraction.default.properties *but I couldn't pass this step (the same error above). I am using Maven 3.0.4, and to start Maven I just followed the guide : clean -> install (on Parent Pom of the DBPedia framework) Scala:run (on DBpedia Dump Extraction) Currently, I want just the default extraction not the abstract, but I can't find a guide. Any suggestion ? Thank you so much. Cheers, Ahmed. [1] On 21 April 2013 18:19, Jona Christopher Sahnwaldt < > wrote: uOn 21 April 2013 19:46, Ahmed Ktob < > wrote: Please pull the latest version from github. Let git overwrite your changes in Import.scala. Maybe git can merge your changes in pom.xml (your folder) with the new parmeter (dump file name). uOK Jona, I will try this. Thank you. On 21 April 2013 20:18, Jona Christopher Sahnwaldt < > wrote: uGood news, now everything seems to work well. You have changed the constant value to a variable as I was thinking. Thanks for the changes and the help. Waiting the process to finish. Cheers, Ahmed. On 21 April 2013 21:30, Ahmed Ktob < > wrote:" "DBpedia properties" "uHi all, I am doing my thesis to get my B.Sc. I.T. (Hons) degree. As part of my Final Year Project (which is the thesis itself) I am investigating how use of ontologies could be made to help users bookmark webpages. I would like to use dbpedia as a kind of a 'universal ontology' in such a way that the bookmark file structure would be like the 'natural order of things'. Essentially, I was thinking of implementing something like this. Each web page would be represented by its most important topics while a category would be represented by its name. I would then like to query dbpedia to see if there exist a relationship between any of the topics and the category name. I would like to know which of the 8000 dbpedia relationships are most relevant to my task. I need relationships like skos:subject or perhaps rdf:type but I guess you guys could give me better advice since you know exactly what they mean. Also, how am I to get to the actual resource from the category name or the page topic. Will rdfs:label work? Thanks for your co-operation, uHi Savio, Sounds interesting. Yes, you could do this by querying the DBpedia SPARQL endpoint. See: Sorry, the relationships were extracted by automated algorithms from Wikipedia, so we also do not know what they exactly mean and you have to explore the dataset yourself to find the relevant parts. 
We are providing three different classification schemata for DBoedia entities, which could be good starting points for your work. See You get the resource by simply dereferencing its URI. There are rdfs:labels in different languages, so you should be fine. Please keep us up-to-date on how your work progresses. Cheers Chris" "Loading dbpedia into Virtuoso, bz2 problem" "uHello I'm following this guide to import all of the latest dbpedia into a Virtuoso instance: The loaderscript (found here: handles gzipped files, but not bz2'ipped ones. The DBpedia dumps are currently provided in bz2 format. Is there a work-around for this? I've contemplated modifying the script (which would require gz_file_open be able to handle bz2: Solutions that do not require recompression to gzip or full decompression on disk are very welcome. Michael uHI Michael, Virtuoso doe not have a built in bz2 function. We do have an old set of external scripts used for loading earlier versions of the DBpedia datasets and does external uncompress the bz2 files, and can be downloaded from: Although the Virtuoso RDFBulkLoader scripts are what were used for loading the current DBpedia 3.5.1 datasets and does load them a lot faster than the older scripts above would Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 27 Jun 2010, at 16:55, Michael Friis wrote: uyouuse this script (which doesn't seem to care for .bz2 files) and do you recompress the .gz files before load? Thanks a bunch! Michael uHi Michael, The procedure we used for loading the DBpedia 3.5.1 data sets into Virtuoso is detailed at: which you referenced in the initial email, and describes the steps we actually used for loading the data sets into the currently available SPARQL endpoint at Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 28 Jun 2010, at 07:58, Michael Friis wrote: uAh, overlooked \"Note the compressed bzip'ed \".bz2\" data set files need to be uncompressed \", I apologize. Michael" "Freebase, Wikidata and the future of DBpedia" "uHi DBpedians! As you surely have noticed, Google has abandoned Freebase and it will merge with Wikidata [1]. I searched the list, but did not find a discussion about it. So here goes my point of view: When Wikidata was started, I hoped it would quickly become a major contributor of quality data to the LOD cloud. But although the project has a potentially massive crowd and is backed by Wikimedia, it does not really care about the Linked Data paradigm as established in the Semantic Web. RDF is more of an afterthought than a central concept. It was a bit disappointing to see that Wikidata's impact on the LOD community is lacking because of this. Now Freebase will be integrated into Wikidata as a curated, Google engineering hardened knowledge base not foreign to RDF and Linked Data. How the integration will be realized is not yet clear it seems. One consequence is hopefully, that the LOD cloud grows by a significant amount of quality data. But I wonder what the consequences for the DBpedia project will be? If Wikimedia gets their own knowledge graph, possible curated by their crowd, where is the place for the DBpedia? Can DBpedia stay relevant with all the problems of an open source project, all the difficulties with mapping heterogeneous data in many different languages, the resulting struggle with data quality and consistency and so on? 
So I propose being proactive about it: I see a large problem of the DBpedia with restrictions of the RDF data model. Triples limit our ability to make statements about statements. I cannot easily address a fact in the DBpedia and annotate it. This means: -I cannot denote the provenance of a statement. I especially cannot denote the source data it comes from. Resource level provenance is not sufficient if further datasets are to be integrated into DBpedia in the future. -I cannot denote a timespan that limits the validity of a statement. Consider the fact that Barack Obama is the president of the USA. This fact was not valid at a point in the past and won't be valid at some point in the future. Now I might link the DBpedia page of Barack Obama for this fact. Now if a DBpedia version is published after the next president of the USA was elected, this fact might be missing from the DBpedia and my link becomes moot. -This is a problem with persistency. Being able to download old dumps of DBpedia is not a sufficient model of persistency. The community struggles to increase data quality, but as soon as a new version is published, it drops some of the progress made in favour of whatever facts are found in the Wikipedia dumps at the time of extraction. The old facts should persist, not only in some dump files, but as linkable data. Being able to address these problems would also mean being able to fully import Wikidata, including provenance statements and validity timespans, and combine it with the DBpedia ontology (which already is an important focus of development and rightfully so). It also means a persistent DBpedia that does not start over in the next version. So how can it be realized? With reification of course! But most of us resent the problems reification brings with it, the complications in querying etc. The reification model itself is also unclear. There are different proposals, blank nodes, reification vocabulary, graph names, creating unique subproperties for each triple etc. Now I won't propose using one of these models, this will surely be subject to discussion. But the DBpedia can propose a model and the LOD community will adapt, due to DBpedia's state and impact. I think it is time to up the standard of handling provenance and persistence in the LOD cloud and DBpedia should make the start. Especially in the face of Freebase and Wikidata merging, I believe it is imperative for the DBpedia to move forward. regards, Martin [1] bu3z2wVqcQc uHi Martin We discussed this issue a bit in the developer hangout, sadly to few people are usually present. On Tue, Jan 27, 2015 at 12:33 PM, Martin Brümmer < > wrote: I think it's more of a resource/implementation problem for them. Publishing linked data requires a major commitment and the tools for it are more than lacking in refinement. Wikidata and DBpedia are 2 different beasts. Wikidata is a wiki for structured data while DBpedia is an Information Extraction Framework with a crowdsourced component that is the mappings wiki. While wikidata might gain a lot of data from Freebase, it won't help them that much if Google does not give the Information Extraction framework behind Freebase. It would mean that the data would get old very fast and the community won't be able to update and maintain it. Though What exactly Google will do remains to be seen. I agree with being proactive, we have a lot of problems in DBpedia that need to be addressed. DBpedia is not only available in triples but also in N-quads. 
-I cannot denote a timespan that limits the validity of a statement. The problem of different changes during time in Wikipedia has been addressed in DBpedia Live and a demo has been presented at the last meeting in Leipzig under the title Versioning DBpedia Live using Memento [3] As you mentioned RDF reification can has drawbacks regarding performance and verbosity. We've had a similar need in one of the applications we developed, reified statements were simply impractical due to their verbosity and performance impact. The solution we came up with was using N-quads to use the 4th quad as an ID for an index. By looking up the ID you can find out information regarding provenance, time etc. I think this is more of a Graph Database problem. We should look at ways it can be implemented effectively in RDF-stores and then propose modifications to the RDF/Sparql standard if needed. Maybe the people from OpenLink or other RDF-Store researchers have some ideas on this issue. Cheers, Alexandru [1] [2] [3] Leipzig2014 uBut you can. Make an IntermediateNode (e.g. Barack_Obama1) and put what you will there. This is used all the time for CareerPost, the association node between a player and a team, etc. Political positions in many wikis are modeled with a lot of shopistication Position1: title, country/region/city/ from which party, etc Term11: which (1,2,3), from, to Colleague111: title (e.g. vicePresident), from, to Colleague112: title, from, to Term21 Position 2 Term21 Colleague211 The best you can map this to (for someone X) is X careerPost X_1: Postion1(title); Term11(from-to); coleague , . X careerPost X_2: Postion1(title) & Term12(from-to) X careerPost X_3: Position2; Term 21; colleague What you cannot map is point to an IntermediateNode of the colleague, and map the from/to of colleagues. (And you can only map their position's title if you use subprops, e.g. vicePresident uHi Martin, how daring that you started this discussion :D I just want to put my 2 cents in it. I think you are mixing things up. Wikipedia, DBpedia, Wikidata, and Freebase are more or less standalone projects. Some are synced, depending, or partially imported into another. But, there is no need and no use of fully importing Wikidata into DBpedia! Better get an RDF dump of Wikidata. The intended import of Freebase data to Wikidata will hardly be complete. One reason is that Freebase has no references of single facts to a particular source, which is a requirement for claims in Wikidata. I.e. unfortunately Freebase will never become imported to Wikidata completely. Freebase has it’s own community of contributors that provide and link facts into the knowledge base. Freebase’s biggest advantage is the easy import of own data. Time will show how this is adapted to Wikidata. Opposite there is DBpedia, which (currently) does not support manipulating A-Box facts. As Alexandru said, DBpedia is about extraction. Am 27.01.2015 um 13:46 schrieb Alexandru Todor < >: Indeed DBpedia community should think about a roadmap for future developments. I do not see any problem with restrictions of the RDF data model as a data exchange framework. But I admit there are some limitations with managing changes and also provenance. However, that is not relevant for most applications that want to work with this data. As Alexandru said, N-quads can be a solution for this. DBpedia extraction framework already supports multiple datasets, at least one for each extraction step. 
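[Editorial aside: the intermediate-node modelling sketched above (a careerPost node carrying a title plus from/to dates) can be written down as plain triples. The property and node URIs below simply mirror the naming used in that message; they are illustrative and are not taken from the live ontology.]

# One intermediate node per held position, so dates and titles attach to the post, not the person.
person = "http://dbpedia.org/resource/Barack_Obama"
post = person + "__1"                                     # hypothetical intermediate-node URI
triples = [
    (person, "http://example.org/vocab/careerPost", "<%s>" % post),
    (post, "http://example.org/vocab/title", '"President of the United States"@en'),
    (post, "http://example.org/vocab/from", '"2009-01-20"^^<http://www.w3.org/2001/XMLSchema#date>'),
    (post, "http://example.org/vocab/to", '"2017-01-20"^^<http://www.w3.org/2001/XMLSchema#date>'),
]
for s, p, o in triples:
    print("<%s> <%s> %s ." % (s, p, o))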
Actually I don’t know whether they are currently delivered or that is behind Virtuoso’s capabilities. Roles in time can also be represented by intermediate role instances. Freebase does that similarly, e.g. [ Speaking about modeling entities in time (and space) and since you are in Leipzig, I strongly recommend \"Ontologie für Informationssysteme - Vorlesung (H.Herre)\" [ DBpedia already supports Memento, which is an accepted standard for Linked Data. DBpedia versions are going back to 3.0. Let me know, if you are interested in this. A student of mine is working on that currently. Proposing best practice, models, technologies, and vocabularies for the LOD is definitely an imperative for DBpedia since it has been a central element and reference for a long time and should be further on. Best Magnus uMartin, When I first started working with RDF, I didn't fully \"get\" the full expressivity of it. All of the things you are saying can't be done (perhaps, easily?) are quite simple to implement. When compared to the property graph model, RDF, at first glance, seems inferior, but in reality, is much more expressive, in my opinion. Through reification, you can express all of the concepts that you are wanting to (provenance, date ranges, etc). At the end of the day, RDF's expressivity comes at the cost of verbosity, which, in my opinion is well worth it. If you would like some help in modeling your graph to represent the missing concepts that you are after, I will be happy to help you out with some more specific examples and pointers if it would be helpful to you. Aaron uDBpedia has a mission that is focused around extracting data from Wikipedia. Importing data wholesale from Wikidata or something like that seems to be inconsistent with that mission, but there are all kinds of temporal and provenance things that could be teased out of Wikipedia, if not out of the Infoboxes. I think most query scenarios are going to work like this [Pot of data with provenance information] -> [Data Set Representing a POV] -> query I've been banging my head on the temporal aspect for a while and I am convinced that the practical answer to a lot of problems is to replace times with time intervals. Intervals can be used to model duration and uncertainty and the overloading between those functions is not so bad because usually you know from the context what the interval is being used to represent. There is a lot of pain right now if you want to work with dates from either DBpedia or Freebase because different kinds of dates are specified to different levels of detail. If you make a plot of people's birthdays in Freebase for instance you find a lot of people born on Jan 1 I think because that is something 'plausible' to put in. A \"birth date\" could be resolved to a short interval (I know was I born at 4:06 in the afternoon) and astrologers would like to know that, but the frequent use of a calendar day is a statement about imprecision, although defining my \"birthday\" as a set of one day intervals the interval is reflecting a social convention. Anyway, there is an algebra over time intervals that is well accepted and could be implemented either as a native XSD data type or by some structure involving \"blank\" nodes. On Tue, Jan 27, 2015 at 11:22 AM, M. 
Aaron Bossert < > wrote: uPaul, The date ranges are doable. I would say that one can still work either as-is, and working with differing levels of specificity, if you work with the dates as they are Aaron uWell, I got to Allen's algebra of intervals because I was concerned about how to deal with all of the different date time formats that are specified in XSD. All of these can be treated, correctly, as either an interval or a set of intervals. Note there are modelling issues that go beyond this. For instance, I still say we retain the birth date and death date properties even though you could model somebody's life as an interval. There are lots of practical reasons, but one of them is that I know my life is not an open ended interval although it looks like that now. Using this as a practical theory of time I can usually figure out what I need to know. I can say, however, if a person has a birthdate in Freebase of Jan 1, X, odds are far less than 0.5 that the person was born on that day. Thus, if I want to say anything about people born on Jan 1, X and not look like a fool, I need to go through those facts and figure out which ones I believe. Thus, in some cases the data is really broken and energy must be spent to overcome entropy. On Tue, Jan 27, 2015 at 1:05 PM, M. Aaron Bossert < > wrote: uHi Alexandru, Am 27.01.2015 um 13:46 schrieb Alexandru Todor: I kind of disagree with you here. I regard and use DBpedia as a source of machine-readable linked data first. Because of its nature as a derivative project extracting Wikipedia data, it is endangered by a potential future in which the Wikipedia crowd maintains their own machine-readable linked data to feed (among others) the infoboxes DBpedia seeks to extract. I fear that, with Freebase becoming a part of Wikidata, this future becomes a little more likely to happen, even if we don't know what Google does, as you rightfully say. I did not know Memento and it's an interesting project, thanks. I would really like to see the problems addressed natively in DBpedia, though. If DBpedia, with its ever changing factual base, thanks to its data source, could present a way of handling persistence of facts and versioning, it could address the concerns of some EU projects that focus on these points, like PRELIDA [1] and DIACHRON [2]. I totally agree with you regarding reification drawbacks. I thought about using the graph name of the triple (that's what I understood the 4th URI in N-quads was commonly used for) as a statement identifier and would regard this as a better solution than using the RDF reification vocabulary, but I'm not sure how it impacts SPARQL querying and graph database performance. Dimitris made me aware of this proposal [3] by Olaf Hartig that needs an extension to Turtle and SPARQL. Does anyone know if there is a comprehensive overview over the many different ways to do reification, addressing advantages and drawbacks of these models? I also agree that this is a graph database problem, but how can DBpedia tackle this issue, except OpenLink possibly reading these emails?
regards, Martin [1] Leipzig2014 uThere's the interesting question of, if we were building something like Freebase today based on RDF, what sort of facilities would be built in for 'Wiki' management. That is, you need provenance metadata not so much to say \"These great population numbers for county X are from the world bank (and if you look closely they linearly interpolate between censuses which could be ten or more years apart\") but more to say \"User Z asserted 70,000 bogus triples\" On Tue, Jan 27, 2015 at 1:43 PM, Martin Brümmer < > wrote: uHi Magnus, Am 27.01.2015 um 15:12 schrieb Magnus Knuth: Well, I just felt like stirring up the community a bit so people have something to argue about while sitting in Dublin's beautiful pubs ;) I'm not so sure about that. From a LOD users perspective, the idea of a place that integrates encyclopedic knowledge in a comprehensive way with high quality is very attractive to me. I'm not alone with that, evidenced by DBpedia's central place in the LOD cloud. RDF dumps are not very easy and reliable to handle and most importantly not linked data. You might be right that Freebase can not be completely merged into Wikidata and that all projects will coexist in their own niche. However, I believe that even then it is a worthwhile cause to tackle triple level provenance, modelling time constraints and persistence of facts throughout DBpedia versions. It's interesting that you bring up manipulating A-Box facts. If we could address individual triples, making statements about them, including their validity and possibly correcting them indivdually without the change being lost after the next conversion could be possible. One might argue that these changes should be done directly in the Wikipedia, but this sometimes implies bureaucracy with Wikipedia editors that I would like to avoid. regards, Martin uOn 1/27/15 1:43 PM, Martin Brümmer wrote: Martin, DBpedia isn't *endangered*. Publishing content to a global HTTP based Network such as the World Wide Web isn't a \"zero sum\" affair. DBpedia's prime goal is to contribute to the Linked Open Data collective within the World Wide Web. To date, DBpedia has over achieved as the core that bootstrapped the Linked Open Data Cloud. Wikidata, Freebase, etcare complimentary initiatives. There gains or loses are not in any way affected by DBpedia. The Web is designed on a \"horses for courses\" doctrine. We can indeed all get along, and be successful, without anyone having to lose out :) [1] ?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FZero-sum_game uWell said, Kingsley :-) As a lurker on this list, I found this an interesting discussion. I can understand Martin's sentiment, and the desire to \"do something\" to ensure that DBpedia will continue to be successful in the future. However, in my experience, such strategic discussions rarely lead to much. My advice is to simply make sure that DBpedia is the best it can be, without worrying too much about \"competitors\". Your users do not expect you to become like Wikidata or Freebase (or to import their data)" "DBpedia - Querying Wikipedia like a Database:Improved dataset released." "uHi Ilario, yes, this can easily be done as the dataset contains German, Italian and French short and long abstracts as well as links to the original Wikipedia pages in these languages. The DBpedia data is accassible through a SPARQL query endpoint at You can for instance ask queries like: SELECT ?name ?description_en ?description_de ?musician WHERE { ?musician skos:subject . 
?musician foaf:name ?name . OPTIONAL { ?musician rdfs:comment ?description_de . FILTER (LANG(?description_en) = 'de') . } OPTIONAL { ?musician rdfs:comment ?description_fr . FILTER (LANG(?description_de) = 'fr') . } } to get the German and French abstract for German musicians. To test this query with the SNORQL query builder, click on: As the query endpoint is sometimes slow, you can also get all data from our download page and filter the German, French and Italien information out and store it locally. Please refer to: dataset via the SPARQL endpoint. You find information about the SPARQL query language here: and information about toolkits to store and process the dumps here: Please feel free to contact me or the mailing list with any further question about how to use the dataset. Cheers Chris" "New user interface for dbpedia.org" "uHey all, as some of you might know, our company has been developing Graphity - an open-source Linked Data client, which provides browser functionality and more. Here's an instance of it running on Linked Data Hub, rendering DBPedia resource of Tim Berners-Lee: You can compare it with the current interface: I think it is safe to say that user-friendliness is on another level. Also check out the SPARQL endpoint which contains an interactive query editor. I would like the DBPedia community to consider making Graphity the default Linked Data interface. After that, we could take things much further: enable editing mode, add custom layout modes etc. Please let me know what you think. The source code can be found here: Best regards, Martynas graphityhq.com uHey Paul, thanks for your feedback. I'm replying to the list as this is relevant. I do not claim this is the final layout, by no means. It's just our take on a generic Linked Data browser, and it is not tailored for DBPedia in any way. But it could be uI am a human being clicking on this and as such I expect something human readable. If you are going to show me a 404 I am going to interpret that as \"the system is broken\" rather than anything else. That is one of the reasons why semantic web gets so much scorn and anger from people who are not \"die hard\" members of the community or not graduate students in an academic system where you need to fit into a particular idiosyncratic culture to survive. You find this attitude all over the place, even at places like OMG, which does a lot of things that are complementary (and unfortunately parallel) to the W3C. For instance on this page you see a link to an OWL ontology if there was an \"it just works\" culture, you could point Protoge at this URL and it would load the ontology, but no, people couldn't make up their collective minds about what means, so you need to manually add the URIs for 10 or so imports in order to do this. If the C language or any of the thousands of other languages that have something like in them had unclear semantics about how to find the files you are importing, we would not be programming in C today, we just wouldn't. OWL has enough problems without stupid little things like this. (To be fair, just try loading the XMI files with the version of Eclipse that is recommended by the OMG and it is also not a \"it just works\" process.) There is another respected vendor in the semweb space that has an excellent triple store that has a \"biomedical browser\" that gives no results on searches like \"omeprazole\" or \"aspirin\" or \"cocaine\". People don't get excited about it, it is just another semweb demo that doesn't work. 
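[Editorial aside on the abstract query quoted earlier in this block: as archived, it mixes up ?description_en, ?description_de and ?description_fr between the SELECT clause and the FILTERs, so the FILTERs test variables that are never bound in their group and the OPTIONALs return nothing. A corrected version, run over standard HTTP against the public endpoint, might look as follows — the category URI is an assumption (it is stripped in this archive), and skos:subject was the category predicate of that era's releases (later releases moved to dcterms:subject, if I recall correctly).]

import json, urllib.parse, urllib.request

query = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?description_de ?description_fr WHERE {
  ?musician skos:subject <http://dbpedia.org/resource/Category:German_musicians> .
  ?musician foaf:name ?name .
  OPTIONAL { ?musician rdfs:comment ?description_de . FILTER (lang(?description_de) = "de") }
  OPTIONAL { ?musician rdfs:comment ?description_fr . FILTER (lang(?description_fr) = "fr") }
} LIMIT 20
"""
req = urllib.request.Request(
    "http://dbpedia.org/sparql?" + urllib.parse.urlencode({"query": query}),
    headers={"Accept": "application/sparql-results+json"})   # standard SPARQL protocol content negotiation
with urllib.request.urlopen(req) as resp:
    for row in json.load(resp)["results"]["bindings"]:
        print(row["name"]["value"], "|", row.get("description_de", {}).get("value", "-"))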
The difference between the semweb community and most of the others (say the people who post to \"Show HN\") is in most communities you click on a link and get a \"404\" it is perceived to be the responsibility of the person who is showing, and not a matter of blaming the end user or the upstream supply chain or anything else.) On Fri, Feb 6, 2015 at 12:55 PM, Martynas Jusevičius < > wrote: uPaul, I agree, the error message could be better and differentiate between \"Resource does not exist' and \"Resouse has no RDF content\". This doesn't changed the fact that you will not be able to browse Wikidata as it does not seem to serve RDF. What do you propose? I don't want to get distracted by details from the main question: is this interface good enough to replace the current one? If not, why not? Martynas On Fri, Feb 6, 2015 at 7:25 PM, Paul Houle < > wrote: uOn 2/6/15 11:46 AM, Martynas Jusevičius wrote: Why? You are adding a visualization to the mix. The tool in question is already listed on the applications collection[1] page currently maintained for the project. Please remember, Linked Open Data is all about loosely-coupling the following: 1. Object (Entity) Identity 2. Object (Entity) Description Location uKingsley, with all due respect, what are you talking about? What visualization? Did you look at the example? It is a generic Linked Data browser interface, which also can be used to publish Linked Data datasets such as DBPedia. All it uses to render the page is the RDF result it retrieves from the source. As to \"why?\" uOn 2/6/15 2:12 PM, Martynas Jusevičius wrote: Yes, of course I looked at the example. It's an HTML page. Just like the DBpedia green pages are HTML pages. HTML pages are ultimately visualization of data encoded using HTML (Hypertext Markup Language). It is a Document endowed with controls (courtesy of HTML). The controls in question enable a user lookup HTTP URIs that identity the subject, predicates, and objects of relations represented using RDF statements. That's it! uHey again, I posted this idea because you suggested so: I still object to your comments about this being a visualization, any more than the current \"green pages\" are. If you care to try, you will retrieve the origin RDF from the application: curl -H \"Accept: text/turtle\" What you see is the application working in browser/proxy mode. If you would deploy it on DBPedia SPARQL endpoint, it would be a Linked Data server which also happens to have the same (X)HTML view. What else do you expect? Fine, lets do a competition! On Fri, Feb 6, 2015 at 9:00 PM, Kingsley Idehen < > wrote: uFor a few basic things, the DBpedia green screens pack in a higher density of data per square inch meaning you can scan them with eyes rather than with your fingers. Also the DBpedia green screens tell you what the predicates and object URIs are so you have a leg up on writing SPARQL queries. Your interface might be more fashionable, but I don't see that it is concretely better (or that much worse) than what DBpedia already has. uOn 2/6/15 3:14 PM, Martynas Jusevičius wrote: Yes, and I am also now suggesting that we address this issue via a competition, since that's objective and democratic. Why do you think what you are doing is news to me? 
See: [1] Tim_Berners-Lee uDear Kingsley, Martynas, all We already have a new interface since August 2013 that is integrated with Virtuoso (but not yet deployed to dbpedia.org) see (click at the top-right to enable) You can find a related publication in LDOW2014 DBpedia Viewer - An Integrative Interface for DBpedia leveraging the DBpedia Service Eco System ( Denis Lukovnikov , Dimitris Kontokostas , Claus Stadler , Sebastian Hellmann , Jens Lehmann ) On Fri, Feb 6, 2015 at 11:15 PM, Kingsley Idehen < > wrote: uhi guys! if we are starting a competition, I nominate LodView :) I have made some implementations for the Italian chapter of DBpedia, I hope that it could be online very soon bye, diego 2015-02-07 15:21 GMT+01:00 Dimitris Kontokostas < >: uHi all, Those of you interested in displaying data in browsable fashion may also find inspiration in Wikidata: This data browser was developed by Magnus Manske. Noteworthy features that you could consider are: * Data-based description text (at the top) * Timeline view (below data) * Map above the data (for things that have a location, e.g., * \"Meaningful\" grouping of data (family relations above other things, cross links to other databases and Wikipedia disentangled from the rest) * More embedded media and related media at the very bottom The view is available in any language (with varying coverage), e.g., Could be a nice student project to port some of these features to the DBpedia view as well. If you are using your own Virtuoso instance as a backend, you could also try your view with Wikidata's RDF data to compare it with the above :-). Best regards, Markus On 07.02.2015 17:09, Diego Valerio Camarda wrote: uOn 2/7/15 1:37 PM, Markus Kroetzsch wrote: This is a nice interface that's actually omitting DBpedia identifiers. Anyway, I contacted the author about this and he has indicated an intent to eventually include DBpedia URIs. Thus, we have a nice interface to Linked Data with the strange characteristic of excluding DBpedia identifiers :( uOn 07.02.2015 20:04, Kingsley Idehen wrote: This is a misunderstanding. It is not a linked data interface, but simply a browser for the data in Wikidata (see it as an alternative UI to wikidata.org). I was only posting it here as a data browser example. Therefore, the simple reason why you cannot see DBpedia URIs is that the interface is displaying only data that is actually stored in Wikidata, and the community has not stored DBpedia URIs. This is understandable, since they are already storing the Wikipedia URLs, and it would seem like duplicating a really significant part of the overall data. There are similar issues with most of the other identifiers: they are usually the main IDs of the database, not the URIs of the corresponding RDF data (if available). For example, the Freebase ID of TimBL is \"/m/07d5b\" but the URI would be \" The community is in a tricky situation there, since most datasets do not use URIs as their primary IDs (which you need to find something on the site). Maybe we can have ways to specify string transformations with the data to have a way to go from primary data to linked data in such cases. Cheers, Markus uOn 2/7/15 2:44 PM, Markus Kroetzsch wrote: No, it isn't duplication. Wikipedia HTTP URLs identify Wikipedia documents. DBpedia URIs identify entities associated with Wikipedia documents. There's a world of difference here!
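[Editorial aside on Markus's point about primary IDs versus URIs: in practice it comes down to a per-database string transformation. A tiny sketch — the Freebase pattern shown is reconstructed from memory and should be verified, and the Wikidata line is just the obvious entity-URI prefix.]

# Per-database templates from a primary ID to a linked-data URI.
def freebase_uri(mid):
    # "/m/07d5b" -> "http://rdf.freebase.com/ns/m.07d5b"  (pattern from memory; verify before relying on it)
    return "http://rdf.freebase.com/ns/" + mid.lstrip("/").replace("/", ".")

def wikidata_uri(qid):
    # "Q80" -> "http://www.wikidata.org/entity/Q80"
    return "http://www.wikidata.org/entity/" + qid

print(freebase_uri("/m/07d5b"))
print(wikidata_uri("Q80"))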
Hmmif you look at the identifiers on the viewer's right hand side, you will find out (depending on you understanding of Linked Open Data concepts) that they too identify entities that are associated with Web pages, rather than web pages themselves. I don't really grasp the point you are trying to make. These should be non-issues when the semantics of relations and the nature of identifiers are understood, in regards to Linked Open Data and a Semantic Web. Kingsley uKingsley, The DBpedia interface is a fork of the existing green page interface. The problem you state may or may not existed at the forking time but this trivial to fix. I've personally asked you several times the last 1+ years to deploy the new interface to dbpedia.org is this is the reason you didn't so far? If you want to go with a competition I really don't mind. Here's two requirements from my side: 1) the new interface implements the triple action framework we introduced in our interface (see the paper for details) 2) there is a clear and easy way to deploy updates ps1. Aesthetic improvements are hard to measure but anything will be nicer than the existing pages ps2. req1 is very important since we plan to use it for services on top of DBpedia. Best, Dimitris Sent from my mobile, please excuse my brevity. On Feb 7, 2015 9:00 PM, \"Kingsley Idehen\" < > wrote: uWould be nice to make more of the icons work, or else remove them. The following icons don’t work: - Openlink Faceted Browser (top right, to the right of every Object, bottom footer) - Openlink Data Explorer (top right, bottom footer) - Relfinder (to the right of every Object) The FTS/autocomplete: searching for a resource on this de.dbpedia goes to en.dbpedia, showing the old interface E.g. searching for “tim ber” goes to Tim_Berners-Lee uHi Vladimir, These issues are already on my todo list, and will be fixed with the upcoming new release of the German dump. The FTP/autocomplete relies on dbpedia lookup, which has some unsolved issues when trying to adapt it to other languages, I don't know if this has been fixed yet. The DBpedia German Sparql endpoint needs to be added in the default relationfinder config for it to work. I don't have access to it, I need to see who does. (@Dimitris, any idea how I can get access there) The Openlink Faceted Browser and Data Explorer links need to be removed since it seems those services have been shut down. Cheers, Alexandru On Wed, Feb 18, 2015 at 6:18 PM, Vladimir Alexiev < > wrote: uOn 2/18/15 12:18 PM, Vladimir Alexiev wrote: +1 There's no reason why those links shouldn't work. In short, there's no reason why (circa., 2105) there isn't an RDF document (one updated regularly) that describes different Linked Data Browsers such that they can be presented as dynamic options to users. As I've stated, this interface needs more formalized testing, across the community. Kingsley uAnother glitch: if I remember correctly, it doesn't find everything. E.g. try autocomplete-search for Sofia and compare to same on (Sorry can't test now," "GSoC: Programming languages" "uHello everyone, I am considering applying to Google Summer of Code, particularly in Spotting task for DBpedia Spotlight. The problem is that I know Java well and work with it every day, but have never written a single line in Scala and barely heard about it. I've spent quite a while browsing the Spotlight code and still can't figure out, what's the rule - why some parts of the code are written in Scala and others in Java? Even small modules are split into two parts. 
Of course I'd like to take GSoC as an occasion to learn new, surprisingly popular language, and also expect myself to be able to read other's Scala code, but would definitely prefer to write my own in Java. Will it be a problem? Thanks, Piotr uHi Piotr, Let's continue this discussion at the correct list for DBpedia Spotlight: dbp-spotlight-developers. The codebase is mixed because we learned Scala while creating DBpedia Spotlight. I personally regret not jumping straight into Scala, as the system is not as smooth as it could because of that. We also accepted contributions in Java, and that ended up in the codebase. In the future we would like to move everything into Scala, but if your application is great, we could live with you writing in Java. Cheers, Pablo 2012/4/3 Piotr Przybyła < > u2012/4/3 Piotr Przybyła < >: I wouldn't think it would be much of a problem, but now is a good time to extend yourself :) Twitter's Scala School is a pretty good place to start: Scala comes with a REPL, so it's easy to play with. If I give you these lines, they're probably a little cryptic: val list = List((\"Foo\", 44), (\"Bar\", 7), (\"Baz\", 65)) val sort = list.sortWith(_._2 > _._2) but if you look at them in the repl: scala> val list = List((\"Foo\", 44), (\"Bar\", 7), (\"Baz\", 65)) list: List[(java.lang.String, Int)] = List((Foo,44), (Bar,7), (Baz,65)) scala> val sort = list.sortWith(_._2 > _._2) sort: List[(java.lang.String, Int)] = List((Baz,65), (Foo,44), (Bar,7)) it's easier to understand what's happening." "circular relationships in SKOS categories?" "uHi, I am working with the SKOD categories file hierarchy of Wikipedia categories, starting a t a specified category (Proteins for example). However, I am finding that some categories are children of each other. For example, given the following triple my understanding is that Enzymes is the parent of Enzymes_by_function. However I also find the triple: which states the reverse. Is this by design? Browsing Wikipedia categories at I see that Enzymes is the parent of Enzymes_by_function and not the other way around. if that's the case, how did DBpedia derive the above triples? Thanks, uIl 03/01/2011 18.57, Rajarshi Guha ha scritto: Hi Rajarshi, Wikipedia is made by men, and men are not perfect. If Wikipedia was written by God, such problems would not exist. If you find some disagreement between DBpedia and the current Wikipedia, it depends on the fact that DBpedia dumps refer to march 2010. If you run this SPARQL query: on this endpoint see such disagreement. All the best, roberto u(followup to There should not be cycles = loops = circular relationships in the #broader Category relationship" "Dataset containing article names present in a dbpedia category" "uHi people, I was looking at dbpedia dumps ( page. What I need is dataset which has all categories and article names of all the pages present in that category. Where is that ???? uPlease respond if anyone has any clue regarding the same. On Wed, Jun 27, 2012 at 12:18 PM, Somesh Jain < >wrote: uHi Somesh, check the Categories (Labels) & Categories (SKOS) datasets Cheers, Dimitris On Wed, Jun 27, 2012 at 10:16 AM, Somesh Jain < >wrote: uYeah, I have already checked 'em. Categories corresponding to articles is present but , the other way round thing is not there. I guess I'll have to do that myself with some code involved. 
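[Editorial aside, stepping back to the disambiguation-dataset thread above: Georgi's heuristic (a page carrying the {{disambig}} template, keeping only wikilinks that contain the disambiguation term) can be sketched over raw wiki markup as follows. The function name and the sample text are made up for illustration; this is not the actual extractor code.]

import re

def disambiguation_links(title, wikitext):
    # Return wikilink targets that share the page's disambiguation term, if the page carries {{disambig}}.
    if "{{disambig}}" not in wikitext.lower():
        return []
    term = title.replace("_(disambiguation)", "").lower()
    links = re.findall(r"\[\[([^\]|#]+)", wikitext)          # targets of [[...]] links
    return [t.strip() for t in links if term in t.lower()]

sample = "{{disambig}} '''Boston''' may refer to: [[Boston]] the city, [[Boston (band)]], [[Boston Terrier]]."
print(disambiguation_links("Boston_(disambiguation)", sample))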
On Wed, Jun 27, 2012 at 1:05 PM, Dimitris Kontokostas < >wrote: uIts exactly the same thing just filter on the 3rd column instead of the 1st On Wed, Jun 27, 2012 at 10:38 AM, Somesh Jain < >wrote:" "Weaker coupling please in configuration" "uI just brought my Spotlight up to date and had a lot of breakage in my configuration file because of things like this //TODO use separate candidate map candidateMapDirectory = config.getProperty(\"org.dbpedia.spotlight.candidateMap.dir\").trim(); if(!new File(candidateMapDirectory).isDirectory()) { //throw new ConfigurationException(\"Cannot find candidate map directory \"+ candidateMapDirectory); } LOG.info(\"Read candidateMap.dir, but not used in this version.\"); I get a null pointer exception because I don't define this property, which must point to a real directory. The big joke is that this is \"not used in this version\" so it's another nonworking artifact that gets people confused when they start working with Spotlight. Spotlight might benefit from dependency injection or another advanced approach to configuration, but it would help a lot to wrap code like the above in an existence check so I don't need to maintain a whole bunch of bogus configuration parameters that the system doesn't use! uPaul, Fair rant to the wrong list. :) Answer follows, but let's move the discussion. A long time ago we decided to use trunk as the live dev branch, and tags to mark stable points. When in time pressure, we (mostly I) end up committing snippets that were added for temporary tests, or for new features that aren't already finished, and shouldn't have made it to SVN. If you want a stable copy, please feel free to use tag/release-0.4 We are moving to git soon, where it will be easier to manage branches away from trunk. This little parameter is for a new feaure that was, in fact, requested by you, but that we did not have time to fully include for this release. You can now decide to load the candidate map in memory, but you still need to touch code for that. It is a working artifact that is not fully configurable yet. And yes, I got spoiled by Scala that doesn't return \"surprise nulls\", rather using the much more elegant Option. Forgot to test for that one null, and it shouldn't throw with the config file committed, so thank you for testing it with an outdated config so that I can remove this bug. About dependency injection, yes, we admit that configuration/dependency management in DBpedia Spotlight needs work. Please feel free to contribute that to the project. Or we will have to wait until I am not overloaded so that I can do it myself. Cheers Pablo On Sep 27, 2011 7:27 PM, \"Paul Houle\" < > wrote:" "YASGUI" "uUse this to play with dbpedia or any other SPARQL endpoint: Has automatic prefix insertion, class and property autocompletion, and Google Chart generation (borrowed from VISU, which itself borrowed the SPARQL editor from yasgui). I just can't say enough good words about it :-) We gradually started using it on our endpoints uHear, hear. I'm using the same thing in my front-ends. Way better than the plain code mirror SPARQL interface. Aaron uThis is pretty cool. One thing that would be fun is if there was a way to store queries in it along with some notes, kind of the way that Kasabi used to work. On Wed, Feb 18, 2015 at 1:05 PM, M. Aaron Bossert < > wrote: uAs the developer behind YASGUI: thanks for the compliments! 
About the functionality of storing queries: the next major release will feature a 'query collection' functionality, allowing you to bookmark and tag queries, group them, and share them. Best, Laurens On Wed, Feb 18, 2015 at 8:42 PM, Paul Houle < > wrote: u0€ *†H†÷  €0€1 0 + uOn Wed, Feb 18, 2015 at 3:17 PM, Laurens Rietveld < > wrote: Perhaps using Github gists like the iPython notebook viewer? would be nice. Another thing to look at for inspiration is the shared query history of the late great Freebase - Tom On Wed, Feb 18, 2015 at 3:17 PM, Laurens Rietveld < > wrote: As the developer behind YASGUI: thanks for the compliments! About the functionality of storing queries: the next major release will feature a 'query collection' functionality, allowing you to bookmark and tag queries, group them, and share them. Perhaps using Github gists like the iPython notebook viewer? Tom uFor whatever it is worth, I implement exactly that functionalitynot hard to do. I use local storage to hold query history and stored queries Aaron uI was indeed thinking about leveraging a bigger collaborative environment. Difficult to find one which supports text search, tagging, and easy API access. GitHub gists only partially fits the bill: in order to provide search functionality, I'd have to maintain an index of all the query gists myself. So other suggestions are welcome as well About the query history: that's something YASGUI supports already, exactly as you describe using local storage to hold the queries. gr Laurens On Thu, Feb 19, 2015 at 12:35 AM, M. Aaron Bossert < > wrote: uHi Laurens! - please add a goo.gl-shortened URL copy, like on VISU - yesterday on yasgui.org, a simple count(*) didn’t want to be charted (GraphDB returns them as xsd:integer), but then it went through ok. Maybe you fixed something on the server :-) uHi Vladimir, Yes, a URL shortener is on the list! Still looking for one which is CORS enabled though, in order to use it in the YASGUI Javascript library (without backend) as well. gr Laurens On Thu, Feb 19, 2015 at 5:04 PM, Vladimir Alexiev < > wrote:" "Some rdfs:labels are missing from articles_labels_pt.nt." "uHi, I'm currently upgrading my own Virtuoso's snapshot of DBpedia3.2 datasets to the 3.4 datasets. I have my own test query, 'SELECT ?s ?p ?o WHERE {?s rdfs:label \"Portugal\"@pt . ?s ?p ?o}' which stopped working. After frustrating hours debugging, I checked that in I found out that the articles_labels_pt.nt does not have the entry: < This entry is in fact included on the articles_labels_pt.nt from DBpedia3.2. While this 3.2 file has 253800 lines, the 3.4 one has 336933 lines, but as I pointed out, it misses important ones like the one above. Switching to 'Lisboa'@pt works, because the 3.4 dataset has the right entry, but I'm really concerned about the fact that there is this really important entry missing, which is really affecting my system's performance, and that there are possibly others also missing. Are you aware of this issue? (Sorry, I didn't searched your mailing-list archives). Is there a patch or another solution? Do you recommend to just load the articles_labels_pt.nt file from the 3.2 dataset OVER what I have now? Thanks, Nuno Cardoso Hi, I'm currently upgrading my own Virtuoso's snapshot of DBpedia3.2 datasets to the 3.4 datasets. I have my own test query, 'SELECT ?s ?p ?o WHERE {?s rdfs:label 'Portugal'@pt . ?s ?p ?o}' which stopped working. 
After frustrating hours debugging, I checked that in Cardoso" "Wikimedia Foundation Elections 2009" "uDear all, There are election going on at Wikimedia: As DBpedia, do we have a favorite candidate, whom we should promote? Sebastian" "Using other languages infoboxes" "uI am curious how dbpedia extracts information from infoboxes from non-english language pages. For example, here are 2 pages about the same historical figure in English and Russian: Russian page is more complete for obvious reasons. Its infobox contains information about father, mother, date of birth, death, wife. This information is not available in dbpedia at What stops dbpedia software from extracting this information? Is it just question of doing some translation or there are more complicated problems? uHi Alexey, yes, combining information from infoboxes in different languages would be great, but also get's extremely tricky. First there is the problem of translation. Second, there is the problem of contradictions. What you do with 20 different population figures for Berlin? Take the German one? Take the English one? Take the last updated one? Take the average? A completely open question is, wheather there are heuristics for this that can be applied to all properties or if you would need different heuristics for different properties? I would guess the last. So next question: What are good heuristics? What are good heuristics for deciding which heuristic to use one which property? So lots of open reseach questions and a big invitation for you to get active in this field ;-) Cheers Chris uOn 9/6/07, Chris Bizer < > wrote: For Berlin population I'd probably go with German one :-) I'd guess so Yes, you are right of course. I just hoped you already did hard work and just want some help with translation :-) I guess a lot of manual work would be required, this is probably another reason why Freebase is useful. I afraid my day job does not leave much time for it. I am interested in it as hobby, let's see how far it'll take me. I am still trying to understand the technology. Thanks! uChris Bizer wrote: I think besides to delve into these questions it would be great, if we could just perform the infobox extraction on different language versions and publish the extracted datasets separately. This should be fairly easy to do - can you guys in Berlin try or should I discuss this with Jens and Jörg here in Leipzig? This would also be a great basis for investigating the mentioned questions and often it might be simply reasonable to just take information from the English infobox extraction if available and from some other if not. Cheers, Sören uHi Sören, Yes, it would make sence to extract these datasets and provide them seperately for donwload. Technically, it should be easy. Just define a new extraction job that runs the infobox extractor over the different dumps. I think interesting questions that arise are: 1. Should you prefix property names with some language identifier in order to allow different infobox dataset to be loaded into the same store and still avoid name collisions and be able to track where different property values came from? I would say yes and maybe do something like dbpedia:property/de_plz as we don't know if plz means something else in Kroatian. 2. How to name resources? Many resource appear in different languages, some resources only apprear in a single Wiki. 
In order to keep the number of URIs low and to be able to easyly merge data from different languages, I would propose to keep on using the English article name as primary identifier for all languages and only add new identifiers for articles that only exist in specific non-English languages. Stuff like dbpedia:resource/de_RestaurantXYZinKreuzberg. What do you think? Georgi is moving to Bristol next week to work for half a year at HP Labs. Piet is busy with his master thesis. Therefore, it would be great, if Jens and Jörg could look into this. Could be an approach that is clearly worth testing. Cheers, Chris uHi all Indeed. It raises the question to know if the \"same\" wikipedia article in different languages have the \"same\" subject, and again how this \"sameness\" should be asserted in RDF. Wikipedia sameness of the article subject across languages versions is asserted by interwiki links, but this sameness is quite weak. Having different figures for Berlin population is less evil, it does not question the fact that both english and german articles more or less agree they speak about the same thing. There will be a lot of cases where more conflicting assertions will be found across languages versions, including the fact that two different articles in a language can link to only one in another language, etc. So, if dbpedia defines two distinct resources, such as How will you assert their sameness? Using owl:sameAs again is the more obvious way, but with expected inconsistencies, and automatic merging of information that we would like to keep distinct. Say we use some \"wiki:equivalent\" property and \"wiki:language\" to assert the language of the description - and that's where having languages as resources is interesting. wiki:language wiki:language wiki:equivalent One can use such triples to set SPARQL queries such as : what is the population of Berlin according to different versions of wikipedia? If resources are made distinct as above, seems that such collisions can be avoided. See above. Keeping the number of URIs and triples low is a trade-off vs seeking semantic accurracy. Again, if you want to be able to set queries with languages parameters, you need to have distinct resources for distinct languages, and languages as resources you can put as SPARQL variables BTW, I figured that various semantic browsers, Tabulator and Openlink as well, ciurrently somehow ignore the language labels. Meaning that the reflexion on multilingual aspects of the SW is yet in its infancy uHi all, In my opinion it's not reasonable to do our generic extraction on all languages and mix all results. It already is difficult for developers to find the right properties to use for a query, because there are ambiguous. For example we have three different properties describing the place of birth of a person, and let's imagine how that would look like with all languages. As it might be interesting for some developers like Alexey to have infobox data extracted from other languages we should provide that datasets, but separate from the main dataset. I'd suggest to follow the Wikipedia path with that and use language specific URIs, like from the German Wikipedia or Berlin from the Italian Wikipedia. These resources can be linked to the \"main\" resource-URI by whatever predicate. Another thing is matching specific infoboxes from other languages to the main dataset. 
If the Person Infobox template from Russian Wikipedia contains nice data about relations between persons, then we could deal with that like we did with the German Persondata template: We create a template-specific extractor which uses the main predicate vocabulary. Maybe we should think about processes to make this specific template extraction easier, but we should not mess things up by throwing everything together. Cheers, Georgi" "Some help with the lookup service." "uHello! I'm following instructions from run a local mirror of the lookup service and got an error on the instructions to clone and build dbpedia lookup I do: *git clone git://github.com/dbpedia/lookup.git*, and then: *cd lookup*, but when I do: *mvn clean install*, I get the error: *\"Failed to execute goal org.apache.maven.plugins:maven-war-plugin:2.1.1:war (default-war) on project dbpedia-lookup: Error assembling WAR: webxml attribute is required (or pre-existing WEB-INF/web.xml if executing in update mode)\"*. It seems to be an error about not having a web.xml file, but I've looked the files in web.xml file coming with it. I have Maven 3.0.5 installed. What may be the problem. I'd be gratefull if any of you helped me. I've already done a bunch of research but haven't found anything. Thanks. Hello! I'm following instructions from and then: cd lookup , but when I do: mvn clean install , I get the error: 'Failed to execute goal org.apache.maven.plugins:maven -war-plugin:2.1.1:war (default-war) on project dbpedia-lookup: Error assembling WAR: webxml attribute is required (or pre-existing WEB-INF/web.xml if executing in update mode)' . It seems to be an error about not having a web.xml file, but I've looked the files in there isn't really a web.xml file coming with it. I have Maven 3.0.5 installed. What may be the problem. I'd be gratefull if any of you helped me. I've already done a bunch of research but haven't found anything. Thanks." "GSoC Project Announcement" "uHello everyone, my name is Vincent and I will work on the project \" A Hybrid Classifier/Rule-based Event Extractor for DBpedia\" during this year's Google Summer Of Code. I'm happy to work for DBpedia Spotlight and with you during the summer. You can find the abstract to my project below. Sincerly, Vincent Bohlen Abstract: In modern times the amount of information published on the internet is growing to an immeasurable extent. Humans are no longer able to gather all the available information by hand but are more and more dependent on machines collecting relevant information automatically. This is why automatic information extraction and in especially automatic event extraction is important. In this project I will implement a system for event extraction using Classification and Rule-based Event Extraction. The underlying data for both approaches will be identical. I will gather wikipedia articles and perform a variety of NLP tasks on the extracted texts. First I will annotate the named entities in the text using named entity recognition performed by DBpedia Spotlight. Additionally I will annotate the text with Frame Semantics using FrameNet frames. I will then use the collected information, i.e. frames, entities, entity types, with the aforementioned two different methods to decide if the collection is an event or not. Hello everyone, my name is Vincent and I will work on the project \" A Hybrid Classifier/Rule-based Event Extractor for DBpedia \" during this year's Google Summer Of Code. I'm happy to work for DBpedia Spotlight and with you during the summer. 
You can find the abstract to my project below. Sincerly, Vincent Bohlen Abstract: In modern times the amount of information published on the internet is growing to an immeasurable extent. Humans are no longer able to gather all the available information by hand but are more and more dependent on machines collecting relevant information automatically. This is why automatic information extraction and in especially automatic event extraction is important. In this project I will implement a system for event extraction using Classification and Rule-based Event Extraction. The underlying data for both approaches will be identical. I will gather wikipedia articles and perform a variety of NLP tasks on the extracted texts. First I will annotate the named entities in the text using named entity recognition performed by DBpedia Spotlight. Additionally I will annotate the text with Frame Semantics using FrameNet frames. I will then use the collected information, i.e. frames, entities, entity types, with the aforementioned two different methods to decide if the collection is an event or not. uHi Vincent, Feel free to reuse the fact extractor codebase for your project: Cheers, On 5/9/16 09:44, Vincent Bohlen wrote: uA warm welcome from me as well Vincent! Can you also share with the community the public weekly update log you will use for your project? Cheers & welcome again Dimitris On Mon, May 9, 2016 at 10:44 AM, Vincent Bohlen < uHi Vincent, Welcome to the DBpedia discussion mailing list. It's good to see that you outreached to the larger DBpedia community. For people who are interested in his project, the github Repo is available under [1], where you can already find some functional code. The weekly reports will be available in the wiki [2]. Cheers, Alexandru [1] [2] On Mon, May 9, 2016 at 10:49 AM, Dimitris Kontokostas < > wrote:" "Disambiguation dataset" "uHi all, There was a feature request by Lee Humphreys some weeks ago on this list on extracting disambiguation pages. I've implemented an extractor and created a test-dataset, which is available for download at [1]. As I would like to ask for feedback first, the data is not available at our public Sparql-endpoint at the moment and is though not available as Linked data. I used a quite simple heuristic: If a Wikipedia article contains the disambiguation template {{disambig}}, create triples with Subject: PageTitle (and cut away \"_(disambiguation)\" like with Predicate: Object: all Wikilinks in that article which contain the disambiguation-term. Wikilinks that do not contain the Disambiguation-term (like on the Boston page) are not extracted. That's also a downside: if you look at you will see that recognized as valid disambiguation links. Suggestion are welcome Lee suggested an approach that deals with labels as well. I will investigate on that one later. BTW, a Redirects-dataset is on the way Cheers, and have a nice weekend Georgi [1] disambiguations.nt.bz2 uGeorgi, Very cool and useful. Looking forward to having this data in the store. Not sure if you've seen it, there's an issue about this in the tracker: Best, Richard On 9 Nov 2007, at 16:57, Georgi Kobilarov wrote:" "Querying and connecting data about architecture education" "uHello all, my name is Moritz Stefaner. I am new to this list, but have been following the semantic web and linking open data discussions for quite a while now. My main field of activity is user interface design and especially visualization. 
You can see some of my interests and work at my blog ( I am currently working on a EU funded research project, which aims at interconnecting eLearning contents about architecture and design ( ), and we would love to connect to and reuse existing information about architectural projects, architects, styles etc For a first impression, some interface demos can be found at: We are currently working on a widget-based web portal to allow metadata based search and browsing of architectural resources and projects. So we started to explore freebase and dbpedia about information about architectural projects, and several questions popped up. Forgive me if they already have been answered, I only superficially skimmed the archives. * We can easily get >15000 projects from freebase [1], even along with Wikipedia keys, but which of the multiple keys are the ones used in dbpedia? Is there any fixed rule/heuristics which to use? * The dbpedia data on buildings seems richer - but I can't find out how to retrieve ALL architectural projects in dbpedia. I tried [2] at , but it stops at * Also, I find it hard to find out which properties buildings typically have in dbpedia. How do you do that? Just inspect lots of instances? Any tricks there? I would appreciate any help and advice. Please answer in simple terms as we are quite new to this and mostly application/front-end oriented :) Beste Grüße, Moritz Stefaner ° Moritz Stefaner Interaction Design Lab FH Potsdam ° [1] [{\"namespace\" : \"/wikipedia/en\",\"value\" : null}],\"limit\" : 100000,\"name\" : null,\"type\" : \"/architecture/structure\"}]} [2] PREFIX rdf: SELECT ?subject WHERE { ?subject rdf:type } LIMIT 100000 ° uHello, mo wrote: [] What exactly do you mean by \"key\"? DBpedia URIs correspond to Wikipedia articles, i.e. information about the object described at Wikpedia article URIs. I don't know how Freebase handles it. There was some discussion that the Freebase guys want to publish their data as Linked Data and interlink it with DBpedia, but as far as I know that has not happened yet. Freebase does have dumps, which (they claim) can be easily converted to RDF. That might be a starting point. There is a limit of 1000 results on the SPARQL endpoint. So either you have to use limit and offset or you just download the complete YAGO file and grep for set is called \"YAGO classes\" and can be downloaded from the DBpedia download page [1]. Basically yes. They are available as Linked Data, e.g. you can just call topics are determined by Wikipedia and in particular the infoboxes. The Beijing_Zoo page [2] does not have an infobox, so you won't get much building specific information there. You could watch out for common Wikipedia infoboxes related to buildings (if those exist) to find typical properties. Kind regards, Jens [1] [2] Beijing_Zoo uHi, thanks for the reply. OK, so I will ask the freebase mailing list how they handle this. OK, thanks. I found 12.627 buildings in the Yago classes file, quite nice. I feared so :) Thanks! Beste Grüße, mo ° Moritz Stefaner +49 179 - 525 21 26 ° uHello, mo wrote: Please ask them to go for Linked Data and DBpedia interlinkage. :-) I just noticed that you don't get all buildings by just grepping for subclasses of building, which have instances assigned to them. To get all those you can add \"define input:inference ' beginning of your SPARQL query in your previous mail (this enables Virtuoso subclass inferencing) and then use LIMIT and OFFSET to find all solutions in several queries. You'll get 28582 buildings this way. 
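To make the LIMIT/OFFSET pattern concrete, here is a minimal Java sketch of such an iterative query loop (it is not code from this thread). It sends the raw query string over HTTP so that the Virtuoso-specific inference pragma is passed through untouched, and it pages in blocks of 1000 rows until a page comes back short. The endpoint URL, the text/csv response format and the rule-set placeholder inside the pragma are assumptions to adapt; the YAGO class URI is the one used elsewhere in this thread.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class PagedBuildingCount {
    public static void main(String[] args) throws Exception {
        String endpoint = "http://dbpedia.org/sparql";
        // Placeholder: put the rule set name from the endpoint's documentation here.
        String pragma = "define input:inference '<RULE-SET-NAME>' ";
        String select = "SELECT ?building WHERE { ?building a <http://dbpedia.org/class/yago/Building102913152> }";
        int pageSize = 1000;   // the public endpoint caps result sets at roughly this size
        int offset = 0;
        int total = 0;
        while (true) {
            String query = pragma + select + " LIMIT " + pageSize + " OFFSET " + offset;
            URL url = new URL(endpoint + "?query=" + URLEncoder.encode(query, "UTF-8"));
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setRequestProperty("Accept", "text/csv");   // assumed to be honoured by the endpoint
            int rows = 0;
            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream(), "UTF-8"));
            try {
                in.readLine();                              // skip the CSV header row
                while (in.readLine() != null) rows++;
            } finally {
                in.close();
            }
            total += rows;
            if (rows < pageSize) break;                     // a short page means we are done
            offset += pageSize;
        }
        System.out.println("buildings found: " + total);
    }
}

Counting rows client-side like this also makes it easy to notice when the endpoint silently truncates a page.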
The YAGO class hierarchy itself is also an interesting source of information (in particular in the upcoming DBpedia release), i.e. the class name alone may tell you something about a building. We will also interlink the classes with YAGO using owl:equivalentClass, so you may get additional information from there (soon). Kind regards, Jens uHi, On 08.08.2008, at 11:58, Jens Lehmann wrote: For the record, I found a thread which seems to explain the string mapping procedure. I do hope, there will be proper explicit linkage in the future, will definitely ask about that! Beste Grüße, mo ° Moritz Stefaner +49 179 - 525 21 26 ° uHi, On 08.08.2008, at 11:58, Jens Lehmann wrote: Before I roll my own: Does anybody have a script (e.g. Python?) for these kinds of iterative queries? Beste Grüße, mo ° Moritz Stefaner +49 179 - 525 21 26 ° uHi, I tried it, but define input:inference ' PREFIX rdf: PREFIX dbp: PREFIX yago: SELECT count(distinct(?project)) as ?num WHERE { ?project rdf:type yago:Building102913152. } gives me only 2959 results on (887 without the inferencing snippet) What am I doing wrong? (Oddly, if I replace rdf:type with a variable, I get even less.) Beste Grüße + thanks in advance, mo ° Moritz Stefaner +49 179 - 525 21 26 http://well-formed-data.net uHello, mo wrote: I tried to figure out what the problem might be, but I'm quite busy at the moment. The low number is a result of the new YAGO data sets in DBpedia 3.1., but I'm unsure why the number of buildings is so lower (I expected it to rise). One problem are remaining encoding issues in the YAGO DBpedia export, but this probably only marginally influences this number. What you could try is to visit the YAGO website [1] and try to find out whether the number of buildings in YAGO itself is higher than in DBpedia. Let us know if you get any insights. Note that YAGO uses different identifiers, e.g.: . That may be due to the fact that RDFS inferencing in Virtuoso is only done if rdf:type is explicitly given in the query. Kind regards, Jens [1]" "dbpedia ontology properties not getting extracted reliably from en Infobox : Film template" "uHi, It seems like important dbpedia ontology properties, such as starring, director, budget etc (generated from the mapping of en Infobox : Film) are not always present for several english films, although the corresponding wikipedia pages do have these properties and are using the Infobox : Film template. For example, consider the following english wikipedia page (which is using the Infobox : Film template) The dbpedia ntriples for the above page does not have the properties This wikipedia page for Blade Runner is also one of the test pages at the following url - The starring property is indeed present in the Infobox on the wikipedia page for this film ( However, the same dbpedia ontology property for starring does get extracted for some other films such as \"Army of Darkness\" I saw several cases of the above behavior of dbpedia ontology properties (starring, director, musicComposer, budget etc) missing for some films, while being present for some other films. However, in each case where the properties were missing in the dbpedia data set, the same properties were present on the corresponding wikipedia page. Is there a reason why the dbpedia ontology properties do not seem to be getting extracted for some films, while the same properties are getting extracted for some other films? ThanksArun uHi Arun, On 02/23/2013 06:39 AM, Arun Chippada wrote: Please have a look on that thread [1]. 
[1] msg04305.html uThanks Morsey. Would love to see this problem with the Infobox:Film mapping getting fixed. I am just getting started with using dbpedia. May be I can try contributing here later, if it doesn't get picked up by anyone. Date: Sat, 23 Feb 2013 09:39:13 +0100 From: To: CC: Subject: Re: [Dbpedia-discussion] dbpedia ontology properties not getting extracted reliably from en Infobox : Film template Hi Arun, On 02/23/2013 06:39 AM, Arun Chippada wrote: Hi, It seems like important dbpedia ontology properties, such as starring, director, budget etc (generated from the mapping of en Infobox : Film) are not always present for several english films, although the corresponding wikipedia pages do have these properties and are using the Infobox : Film template. For example, consider the following english wikipedia page (which is using the Infobox : Film template) The dbpedia ntriples for the above page does not have the properties This wikipedia page for Blade Runner is also one of the test pages at the following url - The test results generated on this page also do not have the properties for this film. The starring property is indeed present in the Infobox on the wikipedia page for this film ( However, the same dbpedia ontology property for starring does get extracted for some other films such as \"Army of Darkness\" I saw several cases of the above behavior of dbpedia ontology properties (starring, director, musicComposer, budget etc) missing for some films, while being present for some other films. However, in each case where the properties were missing in the dbpedia data set, the same properties were present on the corresponding wikipedia page. Is there a reason why the dbpedia ontology properties do not seem to be getting extracted for some films, while the same properties are getting extracted for some other films? Please have a look on that thread [1]. Thanks Arun [1] msg04305.html" "Stable DBpedia IRIs" "uDear DBpedia developers and users, the DBpedia URI for a Wikipedia page simply uses the page's title. If the page is renamed on Wikipedia, the DBpedia URI changes. But cool URIs don't change. [1] What can we do? Here's a simple solution: Use the Wikipedia page ID. When a Wikipedia page is renamed, only its title changes, not its page ID. So to have more stable URIs, we should (additionally) generate URIs based on the page ID. (When a Wikipedia page is deleted and later re-created, the page ID changes, but that is much rarer than renaming.) There are a few questions: uDear JC, let's not duplicate effort. DBpedia Lite [1] provides most of the things you elaborated in your email. All the best, Sebastian [1] On 05/15/2012 05:40 PM, Jona Christopher Sahnwaldt wrote: uThanks for the link! I had never heard of that project. I don't think it's duplicate effort though. DBpedia lite cannot resolve the renaming problems I mentioned. On Tue, May 15, 2012 at 5:46 PM, Sebastian Hellmann < > wrote: uOn 5/15/2012 11:40 AM, Jona Christopher Sahnwaldt wrote: Have you seen this? uSebastian just mentioned it and I replied. :-) On Tue, May 15, 2012 at 6:20 PM, Paul A. Houle < > wrote: uI guess I should add that thanks to some recent changes, it's entirely trivial to implement this. 
All we need is an extractor containing this one line, plus five to ten lines of boilerplate code: new Quad(context.language, DBpediaDatasets.PageIdUris, subjectUri, sameAs, page.language.resourceUri.append(\"_\"+page.id), page.sourceUri, null) On Tue, May 15, 2012 at 5:46 PM, Sebastian Hellmann < > wrote: uThe page_ids dataset is in the following form < I think It might be better to change it to (or add) < so as to have easier access on the id rather than introducing new identifiers Cheers, Dimitris On Tue, May 15, 2012 at 7:26 PM, Jona Christopher Sahnwaldt < uI think we should do both. I like stable URIs. On top of what we already do, we could also dereference I don't know how that would be handled in the VAD though. On Tue, May 15, 2012 at 6:48 PM, Dimitris Kontokostas < >wrote:" "dbpedia arabic chapter" "uhi all the Arabic dbpedia was published in January of this year, i don't found it in this URL ar.dbpedia.org, there is other URL what news of this project, initiated by Dr.haythem and dbpedia team. cordially hi all the Arabic dbpedia was published in January of this year, i don't found it in this URL  ar.dbpedia.org , there is other URL what news of this project, initiated by Dr.haythem and dbpedia team. cordially" "Bug (?) in SPARQL Explorer for sparql" "uHi, I just tried the DBpedia example query \"official websites of companies with more than 50000 employees\" [1] [2] The result shows 20 different companies, but displays 140000 employees for all of them. Christopher [1] [2] %0D%0ALIMIT+20%0D%0A" "Interesting DBpedia mashup: gwannon.com" "uI stumbled upon this today at Programmable Web and didn't see it mentioned here before: This is a mashup that puts notable places in the Earth's seas and oceans onto a Google Map, including shipwrecks, sunken cities, earthquake locations, nuclear test sites and so on. Much of the data seems to come from DBpedia, as well as some shipwreck and earthquake databases. The mashup also pulls photos from Flickr and weather data from some other web service. This is a nice example of data integration, where information from different sources is presented in a unified interface. I like the sense of exploration that one can get from browsing through the categories and oceans. I couldn't find any additional details about the project, so I don't know who is behind this. Best, Richard" "How frequently is DBPedia updated?" "uHi, Is there a place where the last update/import date of DBPedia is stated? I just noticed that, for example, the thumbnail for erroneous URL ( ). — Best regards, Behrang Saeedzadeh Hi, Is there a place where the last update/import date of DBPedia is stated? I just noticed that, for example, the thumbnail for Saeedzadeh uHi, dbpedia.org is updated on a 6-12 month basis. and for the latest updates you can refer to live.dbpedia.org (might be unresponsive due to maintenance) i.e. best, Dimitris On Fri, Nov 15, 2013 at 6:16 AM, Behrang Saeedzadeh < >wrote:" "Live: Can the wiki be updated with an install tutorial" "uHi, Mohamed, can you do a step-by-step tutorial on how to set up a DBpedia live environment? Including the MediaWiki mirror. Especially I would be interested in configuration to use another MediaWiki then Wikipedia (in my use case Wiktionary). Is that possible yet? Also I would like to ask: does DBpedia live already handles multiple language editions of Wikipedia? Or is it just for English now? Do I have to setup a live instance for each? 
Regards, Jonas uHi Jonas, On 07/07/2012 10:35 AM, Jonas Brekle wrote: it's OK, I'll add a step-by-step tutorial to DBpedia wiki. yes, you can do that, it may need some adaptation but I don't think that will be a big deal . Currently, DBpedia-Live supports English language only, but Dimitris will establish another one for Dutch soon." "Wikilink URI's need the resource name to have its first letter capitalised" "uHi all, In the PageLinks extraction code there are references to wikilinks being extracted from the text. The semantics that MediaWiki follows for this process is that it capitalises the first letter of any of these wikilinks when forming the equivalent URL, however, DBpedia isn't currently doing this. This results in both identifiers although MediaWiki treats them the same. See [1] and [2] that integrate the PageLinks versions into the rest of the datasets. This issue doesn't show up in the non-PageLinks datasets so there might be code already implemented for this somewhere else. Cheers, Peter [1] [2] dbpedia:Stage_fright" "how to use two ConditionalMappings?" "uHi! For the bg mapping we need to map \"íàñòàâêà\" to gender as described here: This works fine for simpler mappings e.g. of President (Mapping_bg:Ïðåçèäåíò_èíôî). But for the mapping of Musical artist we need two separate conditions: 1. gender Male or Female (conditioned on íàñòàâêà) 2. class MusicalArtist or MusicalGroup (conditioned on ôîí). I've tried various permutations and nestings but nothing works. As far as I can determine, only the following nestings are allowed: Although Edit>Validate passes, you can see broken nestings on the display page (around 1/3 of the page): And a condition can check only one field. Does that mean that a mapping cannot check two fields? Thanks for your help! Vladimir uSorry, stupid question. Subsequent Conditions can check several fields. - I've fixed - added tests at - documented at Cheers!" "Retrieve attributes given an dbpedia ontology class" "uHey, I am doing some work on semantic search and I may want to use dbpedia as seed data. The thing I want to do now is, given a dbpedia ontology class, say COMPANY, i want to get all attributes of COMPANY. I see there are 170+ classes and 900+ attributes in dbpedia ontology. For each object/instance, its type/class is specified in \"Ontology type\" dataset, its attributes are specified in \"Ontology infoboxes\" dataset. But seems no dataset directly specifies the attributes of a particular class, right? if it is, how to write a SPARQL script to get attributes of COMPANY? I have never written a SPARQL, hope to start with this one. can someone give some help? Thanks in advance Kenny Hey, I am doing some work on semantic search and I may want to use dbpedia as seed data. The thing I want to do now is, given a dbpedia ontology class, say COMPANY, i want to get all attributes of COMPANY. I see there are 170+ classes and 900+ attributes in dbpedia ontology. For each object/instance, its type/class is specified in 'Ontology type' dataset, its attributes are specified in 'Ontology infoboxes' dataset. But seems no dataset directly specifies the attributes of a particular class, right? if it is, how to write a SPARQL script to get attributes of COMPANY? I have never written a SPARQL, hope to start with this one. can someone give some help? Thanks in advance Kenny uSELECT * WHERE { {?s ?p ?o. FILTER(?s IN (< 2009/10/22 Kenny Guan < > uHi Pierre, thanks for your reply. Maybe I didn't describe my question clearly. 
the attributes of Company I want is like \"founder\", \"year of foundation\" etc. A more specific example, an instance of Company like, Apple, has an attribute \"founder\" with value of \"Steve Jobs\". does this make any sense? Many thanks Kenny On Thu, Oct 22, 2009 at 4:42 PM, Pierre De Wilde < >wrote: uOn 22 Oct 2009, at 12:57, Kenny Guan wrote: Try something like: SELECT DISTINCT ?p WHERE { ?s a . ?s ?p ?o . } This gets all resources of type Company, then finds all triples that have this resource as subject, and returns the predicate of these triples. The DISTINCT removes duplicates. Optionally you might want to also put this into the angle brackets, after the last dot: FILTER (REGEX(?p, 'ontology')) This removes all properties that do not have \"ontology\" in the URI, giving you only the properties defined by the ontology, and skipping general infobox properties. Hope that's what you're after. Best, Richard uHi Richard, This is exactly what I want! thank you so much. by the way, how the SPQRQL end point resolve \"?s a < p as and o as ? thanks Kenny On Thu, Oct 22, 2009 at 8:59 PM, Richard Cyganiak < >wrote: uKenny Guan wrote: There's also an OWL file that describes what specific properties are associated with dbpedia ontology classes, if that helps. Overall, partitioning can be a desirable feature for generic databases: although I like keeping a taxonomic skeleton of all of dbpedia and freebase around, for specific projects it's certainly useful to extract properties connected with a particular sort of entity. uOn Thu, Oct 22, 2009 at 10:06 PM, Paul Houle < > wrote: is there an already way to retrieve properties associated with an ontology class in OWL file? uOn 22 Oct 2009, at 14:20, Kenny Guan wrote: the rdf:type property. Best, Richard uOn Thu, Oct 22, 2009 at 16:23, Kenny Guan < > wrote: None that I'm aware of. It's should be quite simple though with some XML parsing or XSLT code. First find the class and all its base classes, then find all DatatypeProperty and ObjectProperty elements that declare one of these classes as their domain. The current ontology file can be found here: Be aware that a new version is coming up: Christopher uHello Kenny I'll try to show that your question can have different levels of answer depending on interpretation, all of them based on SPARQL queries altogether. As others have suggested, you can query the ontology level for properties of which Company class is the domain, by submitting the following at PREFIX rdfs: SELECT DISTINCT ?p WHERE {?p rdfs:domain } This gets the following list of properties. http://dbpedia.org/ontology/areaServed http://dbpedia.org/ontology/divisions http://dbpedia.org/ontology/footnotes http://dbpedia.org/ontology/locationcountry http://dbpedia.org/ontology/marketcap http://dbpedia.org/ontology/netincome http://dbpedia.org/ontology/operatingincome This is pretty straightforward, but you get only properties directly and exclusively attached to Company class through the rdfs:domain declaration. Other properties might be inherited from Company superclasses, named or constructed (such as unionOf classes) or locally attached to this class using OWL restrictions. Not sure the latter happens in the dbpedia ontology, but you can e.g. query for properties of which domain is a Company superclass. PREFIX rdfs: SELECT DISTINCT ?p WHERE { ?p rdfs:domain ?x. 
rdfs:subClassOf ?x } And you get a few more properties http://dbpedia.org/ontology/foundationplace http://dbpedia.org/ontology/keyPersonPosition http://dbpedia.org/ontology/product http://dbpedia.org/ontology/foundationdate http://dbpedia.org/ontology/foundationorganisation http://dbpedia.org/ontology/keyPerson http://dbpedia.org/ontology/foundationperson http://dbpedia.org/ontology/numberOfEmployees But this is only the ontology level and one might wonder which of the above are actually used in DBpedia *instances*. To figure it out, just try the following PREFIX rdfs: SELECT DISTINCT ?p WHERE { ?x a . ?x ?p ?y } and get the surprise to find out more than 1000 different properties actually used to describe companies, meaning that, to paraphrase Hamlet, there is more in heaven and earth instances than is dreamt of in the model philosophy. Why so? Some properties in this list such as \"length\" are indeed weird for a company, so you might want to look for Companies having a length PREFIX rdfs: SELECT DISTINCT ?x ?y WHERE { ?x a . ?x ?y } Among the list you find e.g., < http://dbpedia.org/resource/Orion_International>. And if you look at the description of that one, you find indeed that it has a length, but also a width and a height. Well, strange indeed, but looking further down the description you find the following values for rdf:type dbpedia-owl:MeanOfTransportation dbpedia-owl:Company dbpedia-owl:Resource dbpedia-owl:Automobile dbpedia-owl:Organisation It figures. If you are both Company and Automobile, strange things are bound to happen to you. Hoping this little excursion will help you understand better, on one hand the power of SPARQL, and on the other hand that DBpedia is far from being a consistent set of data. In an open world, the question \"What properties can a Company have?\" has no unique answer. Bernard 2009/10/22 Jona Christopher Sahnwaldt < > uHello Bernard, very nice analysis! Indeed. :-) DBpedia resource types are generated based on Infoboxes in Wikipedia articles. contains two Infoboxes - one for the company, one for its current product. We should use two different RDF subjects for the data extracted for these two infoboxes. It's on our (rather long) to-do-list Christopher uP.S. The main problem is devising a simple but reliable and useful scheme to generate RDF subject URIs for additional infoboxes. On Fri, Oct 23, 2009 at 01:14, Jona Christopher Sahnwaldt < > wrote: uHello Bernard, your analysis is awesome! i learned much from this post, thanks On Fri, Oct 23, 2009 at 6:45 AM, Bernard Vatant < >wrote: WHERE { ?p rdfs:domain ?x. rdfs:subClassOf ?x } will this get all properties of Company's farther classes or just one level up in the hierarchy? uHi Christopher I think this example actually raises the issue of the underlying Wikipedia (bad) practice of putting two infoboxes in the same page for two different subjects. If there is enough detailed information for the product \"Orion VII Next Generation\" to fill an infobox, this product IMO (with my Wikipedian hat on) deserves its own article. Meanwhile if there is more than one infobox in a page, take into account only the infobox of which title matches the article title seems to me a good heuristic. I don't think it's a good idea to create an URI for the additional subject, likely to clash with some other URIs. Suppose the product is called \"Bellatrix\" or \"Betelgeuse\", and you are into trouble. 
Bernard uOn Fri, Oct 23, 2009 at 10:39, Bernard Vatant < > wrote: I agree to some extent, but we have to accept the fact that Wikipedia articles are targeting human readers, and most features, even Infoboxes, are designed to make the layout nicer. Most editors and users don't care if DBpedia et al have a hard time extracting structured data. I found a lengthy Wikipedia discussion about this issue: AFAICT, there is no clear consensus, but the general opinion seems to be that multiple infoboxes should be avoided, but also have legitimate uses. I'm afraid we won't be able to change that. Good idea! But we'd have to relax that rule somewhat. In it's strict form, that rule would drop between 5 and 30 percent (rough estimate) of all articles. Clicking through some more or less random results on articles that wouldn't work: http://en.wikipedia.org/wiki/The_Wizard_of_Oz_(1939_film) http://en.wikipedia.org/wiki/Titanic_(1997_film) Correct. Maybe we could append the title of the infobox to the page title, creating URIs like http://dbpedia.org/resource/Orion_International/Orion VII Next Generation or http://dbpedia.org/resource/Orion_International#Orion VII Next Generation for the second infobox on http://en.wikipedia.org/wiki/Orion_International . Christopher uOn Fri, Oct 23, 2009 at 4:39 AM, Bernard Vatant < > wrote: The problem isn't limited to pages with multiple infoboxes. There are also cases where there's only a single infobox, but it isn't about the subject of the article. Examples that come to mind are, for example, military battles which have an infobox about a notable military commander or vice versa, but I'm sure there are lots of others. Also, the fact that the information in infoboxes is (semi-)structured doesn't make it correct. I was reviewing language infoboxes the other day and came across a bunch where the author had apparently decided that having blank fields was bad, so if there was an iso3 code, but not iso2 code, they'd instead use the iso2 code for the language's language family. Saying these practices are \"bad\" is unlikely to have any effect. The authors of those articles are writing for humans, not computers, and a human can instantly tell what the relationship is between the infobox(es) and the article or, slightly less reliably, why a fact that isn't really true was substituted for an empty field Tom" "DBpedia Live End points return different datafor same query" "uHello, Very surprising results. The difference is so big that if you choose a random resource from the large result set, then it will probably not exist in the small result set. Have you tried? By the way, how do the live endpoints ( - Amount of data - Response time - Availability Anybody knows about that? Regards, Juan Lucas De: A Shruti [mailto: ] Enviado el: jue 26/06/2014 20:42 Para: Asunto: [Dbpedia-discussion] DBpedia Live End points return different datafor same query Hi, I ran the following query against the two DBPedia live endpoints mentioned on the DBpedia Live website ( Query: PREFIX dbo: PREFIX rdf: PREFIX dbpprop: SELECT count(*) FROM WHERE { ?person rdf:type dbo:Person . ?person dbpprop:name ?name . ?person dbo:birthDate ?birthDate . ?person dbo:abstract ?abstract . ?person dbo:wikiPageID ?wikiPageID . 
?person dbo:wikiPageRevisionID ?wikiPageRevisionID OPTIONAL { ?person dbo:wikiPageModified ?wikiPageModified } OPTIONAL { ?person dbo:wikiPageExtracted ?wikiPageExtracted } FILTER langMatches(lang(?abstract), \"en\") } Result when run on end point Result when run on end point Thanks, Shruti u0€ *†H†÷  €0€1 0 + u0€ *†H†÷  €0€1 0 + uThanks to all on their comments and info. I wanted to follow up on Rumi's comment: \" This data is updated regularly with the latest changes from Wikipedia itself.\" If the 2 end points for live data contain the same data then how come the query in my first email return such varying result (one returns a count of ~ 3 million while the other returns a count of ~63 million)? Based on Juan's comment about, I also tried querying for a smaller subset and see different results here as well. The query I used is: PREFIX  dbo: PREFIX  rdf: PREFIX  dbpprop: PREFIX  dbo: PREFIX  rdf: PREFIX  dbpprop: SELECT  ?wikiPageID count(*) FROM WHERE   { ?person rdf:type dbo:Person .     ?person dbpprop:name ?name .     ?person dbo:birthDate ?birthDate .     ?person dbo:abstract ?abstract .     ?person dbo:wikiPageID ?wikiPageID .     ?person dbo:wikiPageRevisionID ?wikiPageRevisionID     OPTIONAL       { ?person dbo:wikiPageModified ?wikiPageModified }     OPTIONAL       { ?person dbo:wikiPageExtracted ?wikiPageExtracted }     FILTER langMatches(lang(?abstract), \"en\")     FILTER (?wikiPageID = 365352 || ?wikiPageID = 39972083)   } GROUP BY ?wikiPageID On  wikiPageID        callret-1 39972083          8  On  wikiPageID        callret-1 39972083          2 365352            544 Why is that happening? ~ Shruti On Friday, June 27, 2014 7:39:39 AM, Kingsley Idehen < > wrote: Wikipedia itself. dbpedia.org has some extra data from other datasets that live will not have. project, so we make sure it runs most of the time, but we will not give guarantees for uptime and availability. gets a fair shot at running queries. such queries may timeout before giving meaningful data back. dbpedia in the cloud so they can have the same data but without competing with queries from other users. instance at: which provides a LOD Cloud cache. Naturally, this will not be as up to date as the DBpedia-Live instance in regards to data transformed from Wikipedia documents etc u0€ *†H†÷  €0€1 0 + uIf you want to run DBpedia queries against your own cloud instance, I suggest you try This has the data pre-loaded and you can be doing queries in ten minutes. There are other AMIs out there, but with perfectly matched hardware and software, this is the only one to meet the requirements for the AWS marketplace. On Fri, Jun 27, 2014 at 9:52 AM, Rumi < > wrote: uOk. As per the DBpedia live website ( ~ Shruti On Friday, June 27, 2014 12:13:01 PM, Kingsley Idehen < > wrote: uHello Juan, all DBpedia Live had some scaling issues and we had to proxy all our requests to The problem was fixed and we silently removed the proxy for testing 1-2 days before your email. The issue is described in detail in Unfortunately the mirrors (e.g. cannot auto heal and the errors remain in the db. We are currently working on a fix to generate a dbpedia Live dump that will be loaded on the mirrors and get them back to sync Sorry for any inconvenience this might caused but we are working on this Best, Dimitris On Fri, Jun 27, 2014 at 10:22 PM, Paul Houle < > wrote: uOK, thanks to all for the info. 
Juan De: Dimitris Kontokostas [mailto: ] Enviado el: lun 30/06/2014 8:39 Para: Paul Houle CC: Asunto: Re: [Dbpedia-discussion] DBpedia Live End points return different datafor same query Hello Juan, all DBpedia Live had some scaling issues and we had to proxy all our requests to The problem was fixed and we silently removed the proxy for testing 1-2 days before your email. The issue is described in detail in Unfortunately the mirrors (e.g. We are currently working on a fix to generate a dbpedia Live dump that will be loaded on the mirrors and get them back to sync Sorry for any inconvenience this might caused but we are working on this Best, Dimitris On Fri, Jun 27, 2014 at 10:22 PM, Paul Houle < > wrote: If you want to run DBpedia queries against your own cloud instance, I suggest you try This has the data pre-loaded and you can be doing queries in ten minutes. There are other AMIs out there, but with perfectly matched hardware and software, this is the only one to meet the requirements for the AWS marketplace. On Fri, Jun 27, 2014 at 9:52 AM, Rumi < > wrote: > Hi Juan, > > > On 27-Jun-14 9:06 AM, Juan Lucas Domínguez Rubio wrote: > > charset=unicode\" http-equiv=Content-Type> > Hello, > Very surprising results. The difference is so big that if you choose a > random resource from the large result set, then it will probably not exist > in the small result set. Have you tried? > > By the way, how do the live endpoints ( > > ( > > - Amount of data > - Response time > - Availability > > Anybody knows about that? > > > > http://dbedia-live.openlinksw.com/sparql so contain the same data. > This data is updated regularly with the latest changes from Wikipedia > itself. > > Statistics on the update process can be found at > http://dbpedia-live.openlinksw.com/live/ > > http://dbpedia.org/sparql/ is based on the 3.9 dataset as published by the > dbpedia team. > It is a static dataset that is refreshed about once a year > > So live.dbpedia.org has the newest data but in some cases dbpedia.org has > some extra data from other datasets that live will not have. > > As for Response time and Availability, it is a best effort project, so we > make sure it runs most of the time, but we will not give guarantees for > uptime and availability. > > There are rate limiters and ACLs in place to make sure everyone gets a fair > shot at running queries. > > Those limits can cause problems for certain kind of analysis as such queries > may timeout before giving meaningful data back. > In such cases we strongly recommend users to setup a version of dbpedia in > the cloud so they can have the same data but without competing with queries > from other users. > > Hope this helps. > > > Best Regards, > Rumi Kocis > > > Regards, > Juan Lucas > > > > > De: A Shruti [mailto: ] > Enviado el: jue 26/06/2014 20:42 > Para: > Asunto: [Dbpedia-discussion] DBpedia Live End points return different > datafor same query > > Hi, > I ran the following query against the two DBPedia live endpoints mentioned > on the DBpedia Live website (http://live.dbpedia.org/) and I get very > different results. Does anyone know why this could be happening? Do the > graphs \" >\" on these end points not contain the same data > set? > > Query: > PREFIX dbo: > PREFIX rdf: > PREFIX dbpprop: > > SELECT count(*) > FROM > > WHERE > { ?person rdf:type dbo:Person . > ?person dbpprop:name ?name . > ?person dbo:birthDate ?birthDate . > ?person dbo:abstract ?abstract . > ?person dbo:wikiPageID ?wikiPageID . 
> ?person dbo:wikiPageRevisionID ?wikiPageRevisionID > OPTIONAL > { ?person dbo:wikiPageModified ?wikiPageModified } > OPTIONAL > { ?person dbo:wikiPageExtracted ?wikiPageExtracted } > FILTER langMatches(lang(?abstract), \"en\") > } > > Result when run on end point http://live.dbpedia.org/sparql: 3792391 > > Result when run on end point http://dbpedia-live.openlinksw.com/sparql: > 63450663 > > Thanks, > Shruti > > CLÁUSULA DE PROTECCIÓN DE DATOS > Este mensaje se dirige exclusivamente a su destinatario y puede contener > información privilegiada o confidencial. Si ha recibido este mensaje por > error, le rogamos que nos lo comunique inmediatamente por esta misma vía y > proceda a su destrucción. > De acuerdo con la nueva ley Ley de Servicios de la Sociedad de la > Información y Comercio Electrónico aprobada por el parlamento español y de > la vigente Ley Orgánica 15/1999 de Protección de Datos española, le > comunicamos que su dirección de Correo electrónico forma parte de un fichero > automatizado, teniendo usted derecho de oposición, acceso, rectificación y > cancelación de sus datos. > > DATA PROTECTION CLAUSE > This message is meant for its addressee only and may contain privileged or > confidential information. If you have received this message by mistake > please let us know immediately by e-mail prior to destroying it. > In compliance with the new Information and Electronic Commerce Society > Services Law recently approved by the Spanish Parliament and with Organic > Law 15/1999 currently in force, your e-mail address has been included in our > computerised records in respect of which you may exercise your right to > oppose, access, amend and/or cancel your personal data. > > >" "Loading URLs with special characters" "uHi all, when I programmatically create an InputStream to download the RDF file for a URL with German \"umlauts\" such as then the RDF file is empty (but with no error). How do I get the triples? Thanks Holger Hi all, when I programmatically create an InputStream to download the RDF file for a URL with German \"umlauts\" such as < Holger uHi Holger, please check whether your program does correctly follow the 303 redirect to that's where you should get your rdf data from. Cheers, Georgi uI think this is already handled. I am working in Java and the code is: HttpURLConnection con = (HttpURLConnection) url.openConnection(); con.setRequestProperty(\"Accept\", \"application/rdf+xml\"); con.connect(); InputStream is = con.getInputStream(); and then some Jena calls that load the triples from the InputStream in RDF/XML serialization. At this stage, the response code is 200 and also con.getInstanceFollowRedirects() is true, so I assume that the API handles such cases automatically. Does anyone have working code for such cases? Thanks Holger On May 11, 2009, at 9:30 AM, Georgi Kobilarov wrote: uAppears to be a bug in the Virtuoso linked data deployment: :~$ curl -I -H \"Accept: application/rdf+xml\" \" \" HTTP/1.1 303 See Other Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: close Date: Mon, 11 May 2009 17:35:41 GMT Accept-Ranges: bytes TCN: choice Vary: negotiate,accept Content-Location: Schönbrunn_Palace.xml Content-Type: application/rdf+xml; qs=0.95 Location: Content-Length: 0 Note the Location header, those bytes should be %-encoded just like in the original URI. Richard On 11 May 2009, at 17:14, Holger Knublauch wrote: uOn May 11, 2009, at 01:40 PM, Richard Cyganiak wrote: Hmmm, so they should. Also the Content-Location header. 
I've raised this to the development team, and it will be fixed promptly. Be seeing you, Ted uOn May 11, 2009, at 03:09 PM, Ted Thibodeau Jr wrote: And uTed Thibodeau Jr wrote: All, Fixed and live. uI can confirm that it works now (from Java via TopBraid Composer, on behalf of a user). Thanks for the very fast turn around to everyone involved :) Holger On May 11, 2009, at 2:32 PM, Kingsley Idehen wrote:" "Invitation to contribute to DBpedia by improving the infobox mappings + New Scala-based Extraction Framework" "uHi all, in order to extract high quality data from Wikipedia, the DBpedia extraction framework relies on infobox to ontology mappings which define how Wikipedia infobox templates are mapped to classes of the DBpedia ontology. Up to now, these mappings were defined only by the DBpedia team and as Wikipedia is huge and contains lots of different infobox templates, we were only able to define mappings for a small subset of all Wikipedia infoboxes and also only managed to map a subset of the properties of these infoboxes. In order to enable the DBpedia user community to contribute to improving the coverage and the quality of the mappings, we have set up a public wiki at which contains: 1. all mappings that are currently used by the DBpedia extraction framework 2. the definition of the DBpedia ontology and 3. documentation for the DBpedia mapping language as well as step-by-step guides on how to extend and refine mappings and the ontology. So if you are using DBpedia data and you you were always annoyed that DBpedia did not properly cover the infobox template that is most important to you, you are highly invited to extend the mappings and the ontology in the wiki. Your edits will be used for the next DBpedia release expected to be published in the first week of April. The process of contributing to the ontology and the mappings is as follows: 1. You familiarize yourself with the DBpedia mapping language by reading the documentation in the wiki. 2. In order to prevent random SPAM, the wiki is read-only and new editors need to be confirmed by a member of the DBpedia team (currently Anja Jentzsch does the clearing). Therefore, please create an account in the wiki for yourself. After this, Anja will give you editing rights and you can edit the mappings as well as the ontology. 3. For contributing to the next DBpedia relase, you can edit until Sunday, March 21. After this, we will check the mappings and the ontology definition in the Wiki for consistency and then use both for the next DBpedia release. So, we are starting kind of a social experiment on if the DBpedia user community is willing to contribute to the improvement of DBpedia and on how the DBpedia ontology develops through community contributions :-) Please excuse, that it is currently still rather cumbersome to edit the mappings and the ontology. We are currently working on a visual editor for the mappings as well as a validation service, which will check edits to the mappings and test the new mappings against example pages from Wikipedia. We hope that we will be able to deploy these tools in the next two months, but still wanted to release the wiki as early as possible in order to already allow community contributions to the DBpedia 3.5 release. If you have questions about the wiki and the mapping language, please ask them on the DBpedia mailing list where Anja and Robert will answer them. What else is happening around DBpedia? 
In order to speed up the data extraction process and to lay a solid foundation for the DBpedia Live extraction, we have ported the DBpedia extraction framework from PHP to Scala/Java. The new framework extracts exactly the same types of data from Wikipedia as the old framework, but processes a single page now in 13 milliseconds instead of the 200 milliseconds. In addition, the new framework can extract data from tables within articles and can handle multiple infobox templates per article. The new framework is available under GPL license in the DBpedia SVN and is documented at The whole DBpedia team is very thankful to two companies which enabled us to do all this by sponsoring the DBpedia project: 1. Vulcan Inc. as part of its Project Halo (www.projecthalo.com). Vulcan Inc. creates and advances a variety of world-class endeavors and high impact initiatives that change and improve the way we live, learn, do business ( 2. Neofonie GmbH, a Berlin-based company offering leading technologies in the area of Web search, social media and mobile applications ( Thank you a lot for your support! I personally would also like to thank: 1. Anja Jentzsch, Robert Isele, and Christopher Sahnwaldt for all their great work on implementing the new extraction framework and for setting up the mapping wiki. 2. Andreas Lange and Sidney Bofah for correcting and extending the mappings in the Wiki. Cheers, Chris uOn Freitag, 12. März 2010, Chris Bizer wrote: I checked out the source from SVN and ran \"mvn -X package\" and I get a \"File name too long\" error. Any idea what might be the cause of this? I'm using maven 2.2.1 under Ubuntu 9.10. Here's more context of the error message: [INFO] includes = [/*.scala,/*.java,] [INFO] excludes = [] [INFO] /home/dnaber/prg/dbpedia/src/main/java:-1: info: compiling [INFO] /home/dnaber/prg/dbpedia/src/main/scala:-1: info: compiling [INFO] Compiling 133 source files to /home/dnaber/prg/dbpedia/target/classes at 1268682448883 [DEBUG] use java command with args in file forced : false [DEBUG] plugin jar to add :/home/dnaber/.m2/repository/org/scala- tools/maven-scala-plugin/2.13.2-SNAPSHOT/maven-scala-plugin-2.13.2- SNAPSHOT.jar [DEBUG] cmd: /usr/lib/jvm/java-6-sun-1.6.0.15/jre/bin/java -classpath /home/dnaber/.m2/repository/org/scala-lang/scala- compiler/2.8.0.Beta1/scala- compiler-2.8.0.Beta1.jar:/home/dnaber/.m2/repository/org/scala-lang/scala- library/2.8.0.Beta1/scala- library-2.8.0.Beta1.jar:/home/dnaber/.m2/repository/org/scala-tools/maven- scala-plugin/2.13.2-SNAPSHOT/maven-scala-plugin-2.13.2-SNAPSHOT.jar - Xbootclasspath/a:/home/dnaber/.m2/repository/org/scala-lang/scala- library/2.8.0.Beta1/scala-library-2.8.0.Beta1.jar org_scala_tools_maven_executions.MainWithArgsInFile scala.tools.nsc.Main /tmp/scala-maven-623058189538601651.args [ERROR] error: File name too long [ERROR] one error found [INFO] uHi Daniel, thank you for reporting this issue. It looks like you are running into a compiler bug. I updated the Scala compiler used by the framework to the most recent snapshot version, which should fix some bugs. I tested it in a clean install of Ubuntu 9.10 (using OpenJDK) which worked. To test if this solves your problem, please update from the SVN and run \"mvn clean compile\" Cheers Robert On Mon, Mar 15, 2010 at 9:25 PM, Daniel Naber < > wrote: uOn Montag, 15. März 2010, Daniel Naber wrote: I solved this by using a different partition to compile on. The partition on which I experienced the problem was ext4 plus encryption. 
Regards Daniel" "DBpedia datasets in the main release" "uHi, I made a list of which extractors are used for which languages in the main DBpedia release. Please let us know what datasets you would like to add!!! I am sure there are several useful additions, for example, redirects and inter-language-links should probably be extracted for all languages. The page is basically a human-readable version of dump/extract.default.properties [1]. Cheers, JC [1] extract.default.properties uGood point! I added them for all languages. On Tue, May 15, 2012 at 4:55 PM, Dimitris Kontokostas < > wrote: uOn Tuesday 15 May 2012 22:26:15 Jona Christopher Sahnwaldt wrote: ImageExtractor for Italian will be ready in a couple of days, I'll ping you after as soon as I tested it. It should probably be a one line patch." "full text search" "uHi, experimenting with the full text search feature I'm wondering how to query for two words, e.g. a query for all entries with \"university\" and \"berlin\" in rdfs:label. Is there a solution doing this server-side I just don't see, or do I have to do that client-side? Cheers Georgi" "Dbpedia lookup results in diferent languages." "uHi! I'm using The Dbpedia Lookup Service but whatever is the language of the string I send in the query, it always returns me information in English. I'd like it to be in Portuguese. Is there a way to configure a language for the Dbpedia Lookup results? Thank you! Hi! I'm using The Dbpedia Lookup Service but whatever is the language of the string I send in the query, it always returns me information in English. I'd like it to be in Portuguese. Is there a way to configure a language for the Dbpedia Lookup results? Thank you! uLuciane, You can try to use the /candidates/ interface from DBpedia Spotlight Portuguese. Try spotlight web service\". (From my phone with a small screen) Cheers Pablo On Oct 3, 2013 9:45 PM, \"Luciane Monteiro\" < > wrote: uThank you so much! , now spotlight works perfectly with Portuguese! Authough my question was in fact about how to use Dbpedia Lookup Service with Portuguese, that was such a valuable information, since I need to use Spotlight in Portuguese too. But do you have any idea on how to you the Dbpedia Lookup Service in Portuguese? Thank you so much!!! 2013/10/4 Pablo N. Mendes < > uDBpedia lookup takes as input an entity name and returns a set of candidate entity URIs for that name. This can be accomplished via the /candidates interface of DBpedia Spotlight. See: It does not return all of the fields (e.g. categories) and the XML format is different. But it basically does the job. Answering your question, as far as I know there is no deployed lookup for Portuguese. But you can download the source could and build it yourself in your own machine. Or you can try to convince someone from to deploy it under pt.dbpedia.org Cheers, Pablo On Fri, Oct 4, 2013 at 8:01 AM, Luciane Monteiro < >wrote: uIs there any tutorial to build lookup from the source in Portuguese? Thank you. Hugo Silva On Sat, Oct 5, 2013 at 7:29 AM, Pablo N. Mendes < >wrote: uThank you very much for the answer, but I'm trying to test it here and I do'nt get any answer. The code is the one authored by you (Pablo Mendes), from Github. That's the class that I made some changes ( The part not in bold is the original and the blue one is the part I changed, trying to get some answer)* .* What's the problem here? (Thank you). 
* public class DBpediaSpotlightClient extends AnnotationClient {* * private final static String API_URL = \" private static final double CONFIDENCE = 0.0; private static final int SUPPORT = 0; * * @Override public List extract(Text text) throws AnnotationException { LOG.info(\"Querying API.\"); String spotlightResponse;* * try {* /* GetMethod getMethod = new GetMethod(API_URL + \"rest/annotate/?\" + \"confidence=\" + CONFIDENCE + \"&support;=\" + SUPPORT + \"&text;=\" + URLEncoder.encode(text.text(), \"utf-8\")); getMethod.addRequestHeader(new Header(\"Accept\", \"application/json\"));*/ * GetMethod getMethod = new GetMethod(API_URL + \"rest/candidate/?\" + \"text=\" + URLEncoder.encode(text.text(), \"utf-8\") + \"&confidence;=\" + CONFIDENCE + \"&support;=\" + SUPPORT); * * getMethod.addRequestHeader(new Header(\"Accept\", \"application/json\")); spotlightResponse = request(getMethod); } catch (UnsupportedEncodingException e) { throw new AnnotationException(\"Could not encode text.\", e); } * * assert spotlightResponse != null; JSONObject resultJSON = null; JSONArray entities = null; try { resultJSON = new JSONObject(spotlightResponse); entities = resultJSON.getJSONArray(\"Resources\"); } catch (JSONException e) { throw new AnnotationException(\"Received invalid response from DBpedia Spotlight API.\"); } LinkedList resources = new LinkedList (); for(int i = 0; i < entities.length(); i++) { try { JSONObject entity = entities.getJSONObject(i); resources.add( new DBpediaResource(entity.getString(\"@URI\"), Integer.parseInt(entity.getString(\"@support\")))); } catch (JSONException e) { LOG.error(\"JSON exception \"+e); } } return resources; } public List getResults(String x) { System.out.println(x); List resultList = new ArrayList (); Text text = new Text(x); System.out.println(text.toString()); try { List response = this.extract(text); for( DBpediaResource dbResource : response ) { resultList.add(dbResource.uri().toString()); } } catch (AnnotationException e) { e.printStackTrace(); } return resultList; } public static void main(String[] args) throws Exception { DBpediaSpotlightClient c = new DBpediaSpotlightClient (); * * Text text = new Text(\"Berlin.\");* * System.out.println(text.toString()); List response = c.extract(text);* * PrintWriter out = new PrintWriter(\"AnnotationText-Spotlight.txt.set\");* * for( DBpediaResource dbResource : response ) { System.out.println( dbResource.getFullUri()); } } }* 2013/10/5 Pablo N. Mendes < > uThe URL path to the service is /rest/candidates/ (observe the s in the end) On Sat, Oct 5, 2013 at 12:11 PM, Luciane Monteiro < >wrote: uHugo, You can try following the directions here: But use PT instead of EN. And you may need to look in the code for where to use BrazilianAnalyzer instead of EnglishAnalyzer. Cheers Pablo On Sat, Oct 5, 2013 at 4:40 AM, Hugo Silva < > wrote: uMy problem is to execute: ./run Indexer lookup_index_dir redirects_en.nt nerd_stats_output.tsv I cant obtain nerd_stats_output.txv :( On Wed, Oct 9, 2013 at 6:41 PM, Pablo N. Mendes < >wrote: uHere: On Wed, Oct 9, 2013 at 11:28 AM, Hugo Silva < > wrote: uSorry, posted to the wrong thread. But, it may be helpful for you too. What you need to do is to run nerd_stats.pig. It is linked from the DBpedia Lookup github page. On Wed, Oct 9, 2013 at 11:34 AM, Pablo N. Mendes < >wrote: uI've tried to run nerd_stats.pig but i have to use amazon web services and i couldn't connect to the account :( Please can you help me? On Wed, Oct 9, 2013 at 7:36 PM, Pablo N. 
Mendes < >wrote: uI couldnt create an Hadoop cluster :( On Wed, Oct 9, 2013 at 7:57 PM, Hugo Silva < > wrote: uThank you! 2013/10/9 Hugo Silva < >" "counting places" "uKingsley, could you kindly add to live.dbpedia.org the same prefixes as dbpedia.org? Thanks! I'm trying to compare the Places hierarchy (167 classes): against what's available in 1. Transitivity of rdfs:subClassOf is not implemented: this returns only immediate subclasses (17): prefix dbo: select * {?x rdfs:subClassOf dbo:Place} 2. Adding a Kleene closure still returns only 35: prefix dbo: select * {?x rdfs:subClassOf+ dbo:Place} 3. Adding \"order by\" increases to a lot more (170). Kingsley, this looks like a bug in Virtuoso. prefix dbo: select * {?x rdfs:subClassOf+ dbo:Place} order by ?x 4. Why sparql returns 4 more than the wiki? - Department and OverseasDepartment are subclasses of each other, which somehow causes them to be listed twice - Library is subclass of EducationalInstitution < uHi Vladimir, The prefixes have been added to the live.dbpedia.org instance … For the problem queries you report is the correct results being returned from Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // Weblog u2 & 3 return the same as expected Returns 713177, which looks reasonable. uVladimir," "language-specific mappings on the ontology infobox properties" "uHello everyone, I'm writing a program which trys to extract the article name and property from a natural language question (which you enter) and then queries the dbpedia. I'm developing it for the german language and came to a dead-end when I tried to find the link between the natural language term of the property, say \"Geburtsdatum\" (birth date) and it's URI in the ontology infobox properties, which is . So far I'm using the rdf dumps, and my thought was to start at the \"Labels\" file, which provides the URI for the Literal \"Geburtsdatum\", take that URI to the \"inter-language links\" file, which provides the URI . Where can I find the missing link to the ontology-namespace URI? Thanks in advance. Kind regards, Christoph Lauer uHi Christoph, You downloaded the instance labels. I think you want the ontology labels. Here: Cheers, Pablo On Mon, May 14, 2012 at 5:13 PM, Christoph Lauer < > wrote: uHi Pablo, The problem with the instance labels is that a lot of the Ontology properties don't have labels in a language or another. Wouldn't it be a good idea to also export all the mapped Infobox properties together with the ontology as owl:equivalentProperty (s), because a) the labels might not always be present and b) they might not always be spelled the same way in both places. Cheers, Alexandru On 05/14/2012 05:41 PM, Pablo Mendes wrote: uSounds good! Not hard to implement either: 1. Define a new Dataset constant. 2. Write code that 2a Loads all mappings (with MappingsLoader) 2b Iterates through all mappings and generates Quad objects (ontology property has equivalent template property). Make sure that ALL applicable property mappings are used: - Not just class TemplateMapping, but also ConditionalMapping etc. - Not just class SimplePropertyMapping, but maybe also CalculatePropertyMapping and a few others. I'm not sure though if that's necessary / useful. 2c Sorts by ontolology property name. 
2d Sends the quads to a Destination On Mon, May 14, 2012 at 6:56 PM, Alexandru Todor < > wrote:" "What exactly mean create Dbpedia lookup index" "uHi, I have a local copy of Dbpedia and now I want to create a Lookup of this one, like the dbpedia lookup at because I have some custom data on my copy of Dbpedia. What I must do for create my lookup from my dataset? I think I must rebuild the index like what I think is correct. Sorry but I‘m a newbie in dbpedia world. Please help me. Thank you. Hi, I have a local copy of Dbpedia and now I want to create a Lookup of this one, like the dbpedia lookup at you. uPlease someone answer me!! I need to know this! Thank you. 2013/7/17 Giosia Gentile < > Please someone answer me!! I need to know this! Thank you. 2013/7/17 Giosia Gentile < > Hi, I have a local copy of Dbpedia and now I want to create a Lookup of this one, like the dbpedia lookup at you. uHi Giosia, In which exactly part of the step-by-step instructions [1] did you get stuck? Cheers, Dimitris [1] On Mon, Jul 22, 2013 at 6:33 PM, Giosia Gentile < >wrote: uOn Tue, Jul 23, 2013 at 8:58 AM, Giosia Gentile < >wrote: Maybe the Spotlight guys can answer that Dimitris uOn Tue, Jul 23, 2013 at 8:58 AM, Giosia Gentile < > wrote: This refers to this program and specifically this Pig Latin script: You need to run it with the Wikipedia dump of your choice and feed the TSV output to the Lookup indexer. Cheers, Max" "would like to show show my colleagues DBpedia" "u uForsberg, Kerstin L wrote: Kerstin, Here are a few other DBpedia demonstrations that you should find relevant: 1. A Collection of DBpedia Queries I've built and saved for public consumption (just click on an of the links) 2. An Interactive Query Builder for SPARQL (iSPARQL) that enables: - saving and sharing of Query Definitions (\".rq\" files) e.g. DBpedia SPARQL Query about TimBL - saving and sharing of Linked Data Pages (\".isparql\" files) e.g. DBpedia Snapshot of TimBL 3. OpenLink RDF Browser (for interacting with URIs) that also enables: - saving and sharing of Browser Sessions (\".wqx\") files e.g. DBpedia data about Silicon Valley Companies 4. A Basic SPARQL Query Interface and Tutorial based on the Data Access Workgroups (DAWG) SPARQL Testsuite. In the coming weeks I am going to try to spend a little more time demonstrating the power of Open Access to Linked Data from the perspective of creating and sharing \"Context\" (Information) via Semantic Web Data Spaces etc Note: - The Linked Data Pages (.isparql files) and Query Definitions can be loaded into the live iSPARQL Query Builder and used as the basis for other queries. Just use the File|Open Menu sequence (or click on the File Open Toolbar icon). - The Linked Data Pages have an Ajax based enhanced hyperlink feature (what we call ++ internally) that enables a variety of interactions with a URI (one of the critical differences between Document Web and Data Web traversal). This also means that the pages will not work with IE6 or Safari (WebKit works fine). All other browsers across all relevant platforms will work (i.e. Opera, Firefox 1.5+, Camino, Webkit, IE7 etc). - The RDF Browser's SVG based Graph Visualizer will not work with IE (any version) since IE doesn't support SVG. uOn 3 Jun 2007, at 10:43, Forsberg, Kerstin L wrote: The SPARQL Explorer (a.k.a. “Snorql”) doesn't work in IE. A known issue. What happens when you try to access the domain? This is just a plain old web server serving static HTML, it works great in IE for me. 
Richard uThanks Richard, don't now what why - but now I can access the dbpedia.org from our different computers at home - I really like the Snorql interface when I test it on my Mac!! However, I do have problems replicate the query example \"People who were born in Berlin before 1900\" using the Leipzig query builder ( Kerstin" "Subjects without labels in the domain of rdf:type" "uI'm loading dbpedia ontology type assignments from 3.5.1 into my system and noticing a significant number of subjects that don't have labels Here are just a few I saw going by: . . . . . . . . . Some interesting stuff seems to be going on here, and I'd like to see it better documented For a long time there's been the problem that some named entities have their own DBpedia resource and others don't because of the accidental nature of how things work on Wikipedia. Here's an example of two fictional characters that play an isomorphic role in two very similar works: The perfect \"generic database\" would treat these two entities in the same way And it looks like dbpedia is starting to address this. has two engines listed in the Infobox, and it appears the extractor is parsing the \"name\" of the engine to extract some nice facts: On the other hand, this one looks like doesn't have such clear value It seems to be identifying a \"facet\" of the boat, that is, its career as a part of the U.S. National Register of Historic Places. The value of this isn't so clear to me, but perhaps there is some sense to it. The oddly named dbpedia-owl:added field applies to this career, so maybe this makes sense. The two PersonFunctions up there look specious, particularly because I don't see dbpedia-owl:Writer dbpedia-owl:Actor dbpedia-owl:Comedian dbpedia-owl:MusicalArtist anywhere near the Person or PersonFunction instances. That said, I do see some places where \"PersonFunction\" has promising roles to play: the functionEndData and functionEndYear properties are obviously useful. There's an issue of how these new \"synthetic\" identifiers (that don't correspond 1-1 to wikipedia pages) map to external resources. \"PersonFunction\" is similar (but probably not equivalent) to the \"Employment Tenure\" CVC in Freebase. Certainly some of these new synthetic identifiers correspond to resources in Freebase as well, since Freebase sometimes splits wikipedia topics into multiple entities. Note also looking at I see a lot of properties that just don't make any sense at all, such a \"wavelength\", \"sourceConfluenceMountain\", \"shipDraft\". I don't know if this some artifact of how the whole system works (looks like these properties all have a domain of \"owl:Thing\") but it does shock my sensibilities. (From a strictly owl wavepoint, I guess that something that has a \"wavelength\" is a thing, but by that standard, anything that has any property at all is a \"Thing\")" "Formatting Dbpedia query" "uI want to ask a question from dbpedia using Jena code. There is a GUI button in eclipse \"Solve question\" when clicked JEna code is executed and SPARQL query runs against dbpedia. The question is what is capital city of Germany and have four options. My query is like this: select distinct ?rightAnswer ?wrongAnswer where { dbr:Germany dbo:capital ?rightAnswer . ?s dbo:capital ?wrongAnswer . filter ( ?s != dbr:Germany && ?wrongAnswer != ?rightAnswer )} limit 4 It gives me answer like shown in table below: I want answer in format like this so that user can select one name: .Berlin .Padeborn .Budapest .Stanley How can I achieve this? 
rightAnswerwrongAnswer Budapest u1. You need to get the name. IMHO the best results come from dbp:officialName: select distinct ?rightAnswer ?wrongAnswer ?wrongName where { dbr:Germany dbo:capital ?rightAnswer . ?s dbo:capital ?wrongAnswer . filter ( ?wrongAnswer != ?rightAnswer ) ?wrongAnswer dbp:officialName ?wrongName filter(lang(?wrongName)=\"en\") } limit 4 - rdfs:label is the page name, which often includes the country name, which defeats the purpose of your quiz - foaf:name returns many per place, and some are not in EN. Note: probably not all have dbp:officialName, but enough do 2. You may want to limit to *country* capitals: select distinct ?rightAnswer ?country ?wrongAnswer ?wrongName where { dbr:Germany dbo:capital ?rightAnswer . ?country a dbo:Country; dbo:capital ?wrongAnswer . filter ( ?wrongAnswer != ?rightAnswer ) ?wrongAnswer dbp:officialName ?wrongName filter(lang(?wrongName)=\"en\") } limit 4 Note: you get some weird \"Countries\", e.g. Cricket_Samoa, which is their cricket organization. The reason is that someone made that page using {{Infobox country}} index.php?title=Cricket_Samoa&action;=edit" "Lookup service results for "building"" "uHi, I'm using the dbpedia lookup service and I'm surprised I do not get Is it the expected behavior? In that case, what should be my query in order to get that resource? Cheers, María uHi María, yes, it is strange. The wikipedia page exists, also exists in DBpedia as you mentionMay be the index is out of date. I do not know the update policy. Perhaps people responsible for the maintenance of the service (Pablo, Max, Matt) could give you some hint. Best regards, uHi María, the service is working as expected. The results are ordered by RefCount (i.e., how often the respective Wikipedia article is linked from another one) in descending order. By default, the service returns the top 5 results, and Try and you will find what you are looking for. Best, Heiko Am 23.04.2014 12:14, schrieb Mariano Rico: uThanks Heiko, that makes things more clear. For some reason I was expecting that resource to be among the first results as the string matching between the input and the resource is higher. Is it considered during the calculations as well or it is based just in the RefCount? Just out of curiosity. Regards, María On Mon, Apr 28, 2014 at 10:06 AM, Heiko Paulheim < > wrote: uHi María, I am not a total expert here, but as far as I know, the sorting is only done based on RefCount. In most cases, this works reasonably well, but there are some odd effects just as the one that you found. Best, Heiko Am 28.04.2014 10:15, schrieb María Poveda:" "SPARQL query with dbpedia-owl:birthDate" "uHi All, I am new to \"dbpedia\";  I  have the below  SPARQL query to get all Athletes whose birthdate is \"1988-08-29\"; I get a set of athletes when I  do curl, but its not complete set since I am expecting an athlete but he is not part of the response.     { ?subject rdf:type foaf:Person;  rdf:type dbpedia-owl:Athlete; dbpedia-owl:birthDate '1988-08-29'^^xsd:date }}  Response: dbpprop:birthDate : 1988-08-29 (xsd:date)  I am doing anything wrong here?  Thanks -Siva Hi All, I am new to \"dbpedia\"; I have the below SPARQL query to get all Athletes whose birthdate is \" 1988-08-29\"; I get a set of athletes when I do curl, but its not complete set since I am expecting an athlete but he is not part of the response. 
name=\"subject\"> name=\"subject\"> name=\"subject\"> name=\"subject\">http://dbpedia.org/resource/Carys_Hawkins http://dbpedia.org/resource/Grzegorz_Zengota http://dbpedia.org/resource/Mladen_Popovi%C4%87 http://dbpedia.org/resource/Jorge_Ibarra http://dbpedia.org/resource/Fritz_Lee http://dbpedia.org/resource/Stephanie_Proud http://dbpedia.org/resource/Bryan_de_Hoog http://dbpedia.org/resource/Tungalagiin_M%C3%B6nkhtuyaa Look at this page: http://dbpedia.org/page/Agim_Ibraimi dbpprop: birthDate : 1988-08-29 (xsd:date) I am doing anything wrong here? Thanks -Siva uHi Siva, Seems to me you're querying the DBpedia Live endpoint, and the Agim Ibraimi resource there simply doesn't have the information you're requesting associated to it ( Perhaps you could try querying Best, UroÅ¡ On 28.3.2014 13:09, s.siva kumar wrote: uSiva, UroÅ¡ is right. An additional remark: there's an important difference between the dbpedia.org/property/ (\"dbpprop\") and dbpedia.org/ontology/(\"dbpedia-owl\") namespaces: In your mail, you're getting them mixed up a bit. JC On Mar 28, 2014 1:54 PM, \"Uros Milosevic\" < > wrote:" "build error in dbpedia release 3.5.1" "uHi all, i try running the dbpedia extraction framework (the release of Dbpedia 3.5.1 which written in scala ) i have one build error .I hope anyone help me in this error i do all steps in the problem appears when running mvn scala:run i have the following error Failed to resolve artifact.   Missing: 1) org.dbpedia.extraction:core:jar:2.0 the complete error i get is: [INFO] Scanning for projects [INFO] snapshot org.scala-tools:maven-scala-plugin:2.13.2-SNAPSHOT: checking for  updates from our archiva [WARNING] repository metadata for: 'snapshot org.scala-tools:maven-scala-plugin: 2.13.2-SNAPSHOT' could not be retrieved from repository: our archiva due to an e rror: Error transferring file: Connection reset [INFO] Repository 'our archiva' will be blacklisted [INFO]" "property domain and range values" "uHi, I have a question regarding some properties in DBpedia. I see some properties have the same name with slightly different URL s. For example look at the following properties, Why there are properties for the same cause but with different property values? In this case the only difference is in the URL middle part where we have \"property\" and \"ontology\". When I try to get domain and range values for these properties, the properties with \"ontology\" as a part of the URL has domain and range values (retrieved by sparql). Can somebody tell me why there are two kinds of properties for the same meaning and only one type has domain and range values set? Thank you. Kalpa P {margin-top:0;margin-bottom:0;} Hi, I have a question regarding some properties in DBpedia. I see some properties have the same name with slightly different URL s. For example look at the following properties, with 'ontology' as a part of the URL has domain and range values (retrieved by sparql). Can somebody tell me why there are two kinds of properties for the same meaning and only one type has domain and range values set? Thank you. Kalpa uThe properties starting with: are extracted \"as-is\" from Wikipedia. 
The properties starting with are mapped: for details see : All the best, Sebastian On 04/20/2012 06:08 AM, Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva wrote: uThis should also have been answered in our FAQs: Cheers Pablo On Apr 20, 2012 6:25 AM, \"Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva\" < > wrote:" "Mapping and redirected infoboxes" "uHi all, I noticed there is a problem with redirected infoboxes and the test extraction. If I create a mapping for infobox A which is redirected from infobox B, all the entities which actually use infobox B will not get mapped in the test extraction. Example: extracted (among all). Checking the wikipedia article [1] I see that it uses Infobox ski jumper [2] which redirects to Infobox skier [3] Would it be possible to correct the test extraction framework? Moreover, the mapping does not show up in live dbpedia [4]. Not even for those which are mapped in the test extraction, e.g. [5] Also, on [5] it looks like the information about the Template:Infobox_skier disappeared, while it is present in default dbpedia [6]. Do you know what is going on? Thanks, Andrea [1] [2] [3] [4] [5] [6] Anne_Heggtveit uHi Andrea, On 12/29/2012 12:46 PM, Andrea Di Menna wrote: You are right in that, we will check that issue. I can see it on DBpedia-Live for [5]. The template \"Template:Infobox_skier\" is also there. uHi All, Is there solution for this problem? I also have the same problem.   Regards, Riko Hi Andrea, On 12/29/2012 12:46 PM, Andrea Di Menna wrote:" "Sitemap for DBpedia 3.1" "uHi, I notice that the sitemap at yet been updated for DBpedia 3.1. I would like to provide an updated sitemap. What changes are required to bring the sitemap up to date? Is it just changing all the download locations from /3.0/ to /3.1/? Or did some of the filenames change? Is the list of dumps that are loaded into Virtuoso any different in 3.1 than in 3.0? Cheers, Richard uHello, Richard Cyganiak wrote: There has been a name change in the yago files and the Geo extractor is run for all languages. I believe for all other files only the version number needs to be changed. Below is the list of files loaded into the official SPARQL endpoint (@Zdravko: copied from your mail on August 8, please correct me if anything has changed). 
Kind regards, Jens links_uscensus_en.nt links_revyu_en.nt links_quotationsbook_en.nt links_musicbrainz_en.nt links_gutenberg_en.nt links_geonames_en.nt links_factbook_en.nt links_eurostat_en.nt links_dblp_en.nt links_cyc_en.nt links_bookmashup_en.nt infoboxproperties_en.nt infobox_en.nt disambiguation_en.nt categories_label_en.nt articles_label_en.nt articlecategories_en.nt longabstract_pl.nt longabstract_no.nt longabstract_nl.nt longabstract_ja.nt longabstract_it.nt longabstract_fr.nt longabstract_fi.nt longabstract_es.nt longabstract_en.nt longabstract_ru.nt longabstract_pt.nt longabstract_zh.nt longabstract_sv.nt shortabstract_de.nt redirect_en.nt persondata_en.nt shortabstract_pt.nt shortabstract_pl.nt shortabstract_no.nt shortabstract_nl.nt shortabstract_ja.nt shortabstract_it.nt shortabstract_fr.nt shortabstract_fi.nt shortabstract_es.nt shortabstract_en.nt wikipage_ja.nt wikipage_it.nt wikipage_fr.nt wikipage_fi.nt wikipage_es.nt wikipage_en.nt wikipage_de.nt wikicompany_links_en.nt skoscategories_en.nt shortabstract_zh.nt shortabstract_sv.nt shortabstract_ru.nt yagolink_en.nt yago_en.nt wordnetlink_en.nt wikipage_zh.nt wikipage_sv.nt wikipage_ru.nt wikipage_pt.nt wikipage_pl.nt wikipage_no.nt wikipage_nl.nt longabstract_de.nt image_en.nt geo_zh.nt geo_sv.nt geo_ru.nt geo_pt.nt geo_pl.nt geo_no.nt geo_nl.nt geo_ja.nt geo_it.nt geo_fr.nt geo_fi.nt geo_es.nt geo_en.nt geo_de.nt flickr_en.nt homepage_fr.nt homepage_en.nt homepage_de.nt externallinks_en.nt uJens, Thanks for the list! On 20 Aug 2008, at 11:45, Jens Lehmann wrote: Okay. I created an updated sitemap, it is attached, please publish it on dbpedia.org so we can start the Sindice indexer on the 3.1 dataset. (I verified that all the dump URIs in the list are dereferenceable (no 404s), but I don't have a way of checking if the list is complete.) Cheers, Richard xml version='1.0' encoding='%SOUP-ENCODING%' > uJens, Sören, I sent an updated DBpedia sitemap almost three weeks ago, see below. It is still not published. May I ask what the obstacle is? Can I help in any way? Thanks, Richard Begin forwarded message: uRichard Cyganiak wrote: Its very simple - obviously Jens and me overlooked your email. Sorry for that, the updated sitemap is live now. Sören" "Dbpedia game" "uHello sir What are the dbpedia classes which are specific to games? How can we query it? Regards Hello sir What are the dbpedia classes which are specific to games? How can we query it? Regards" "German Geo Coordinates Missing" "uHi There, I'ld like to work with geo coordinates but the german ones are missing. The Extracted size is smaller than 10kb. I could use the Infobox Properties, but they are way to large for my work. Any idea why there are no german data or where to gather them? Best regards Chris uHi, It's a known problem. The geocoordinates extractor does not seem to pick up most of the geo coordinates in the German dumps. I'll try again to fix it when I have time, most of the geo coordinates are mapped right now and extracted. In the mean time: pick up the mappingbased-properties file from 20.03.2014 and extract the geo coordinates with grep 'wgs84_pos' dewiki-20140320-mappingbased-properties.nt > geocoordinates.nt (I already placed a pre-extracted file in the same folder for you). Cheers, Alexandru Alexandru-Aurelian Todor Freie Universität Berlin Department of Mathematics and Computer Science Institute of Computer Science AG Corporate Semantic Web Königin-Luise-Str. 
24/26, room 116 14195 Berlin Germany [1] On 04/01/2014 03:50 PM, Christian Becker wrote:" "help" "uDear everyone: Now I encounter some big problems with the DBpedia information extraction framework. I do exact steps which are showed in DBpedia & Eclipse Quick Start Guide. But I cannot open the source files. The eclipse tell me: Could not open the editor: extraction does not exist. Java Model Exception: Java Model Status [extraction does not exist] I think there is something wrong with the m2eclipse. I want to run this project in local. But I do not know how to set it. if know this solution or not pick up what I said, you can ask me for further description. Bobby" "Odp: Odp: probably incorrect mapping to schema.org from MusicalArtist" "uHi Aldo, thanks for the references - I will definitely look into them. I know about the systematic polysemy phenomenon - e.g. sentences where you use two meanings of one polysemous word are semantically valid. But I am not saying that you have to make a commitment to one classification of a certain \"object\" that is described in Wikipedia, e.g. church as a building vs church as a community. What I am saying is that when you use DBpedia ontology, where the classes are vaguely defined you are indeed making some commitments (as Peter pointed out) by interpreting churches as a buildings, while in Cyc there are such distinctions, but there are also relations that keep these interpretations together. What I am mostly concerned with is the process of developing DBpedia ontology. I have pointed out inconsistencies in Wikipedia which are a by-product of its community based development. Having 5 or 6 means of expressing one information (e.g. administrative categories) is not a problem of polysemy, but is a problem of a community driven process. Assuming DBpedia ontology creators will agree that a church is a systematically polysemous concept I don't expect that all systematically polysemous concepts will be treated as such. My belief is based on my careful examination of Wikipedia (incoherent) structure and the problems discovered in the DBpedia ontology that were discussed so far. But you may be right in yet another sense - there is no much semantics in SemanticWeb. At least - if you look at DBpedia from God's perspective you are making a mistake. There are only portions of the data that are useful in certain applications and you have to adjust them to your \"ontology\" before you start using it. Kind regards, Aleksander" "dump problems" "uHi, i'm currently struggling with the DBpedia 3.7 dumps, but as the DBpedia 3.8 seems to be on the road i thought i'd let you know about some of the problems i encountered by now which make it tricky to work with the dumps. I downloaded the 3.7 all_languages.tar and the all_languages-i18n.tar and the interlang-i18n.tar . The problems i encountered are mainly related to the i18n dataset, but can also be found in the interlanguage links of the \"traditional\" dump (in the all_languages.tar): It seems as if the de, el and ru wikipedia were exported in a different way from other languages, as their encodings are different and they have the language prefix in the URIs (de.dbpedia.org). They use UTF-8 IRIs in the dump files, while all other languages i tried use % escaped URIs and don't have a language prefixed URI. This leads to a couple of encoding issues (.nt files are ASCII only and normally use % encoding, but de, el and ru contain UTF-8 here) and inconsistencies (e.g., interlanguage links to fr.dbpedia.org pointing into nothing). 
Also the interlanguage link files show different ways of encoding within the same file, making them tricky to load. Details where you can see these issue: (the following is all related to the 3.7-i18n dumps!) File Encoding: Mainly the de, el and ru, as well as the interlanguage_link dumps contain .nt files with UTF-8 encoding (for IRIs). This isn't valid .nt, so maybe consider renaming all those files to .n3? Example: from de/labels_de.nt.bz2: \"Anschlussf\u00E4higkeit\"@de . from fr/lables_fr.nt.bz2: \"Alg\u00E8bre g\u00E9n\u00E9rale\"@fr . Encoding & escaping: The de, el and ru IRIs seem to be UTF-8 encoded and () seem to be \"unescaped\". All other languages I tried seem to use % encoded URIs and escape the brackets with %28 / %29. Example: from interlang-i18n/el/interlanguage_links_el.nt.bz2 . . . . . Inter language links: en and de just link to each other and el. ( minor: interlanguage_links_en.n3.bz2 seems to have 3082 triples of this form: ?s owl:sameAs ?s ) All other languages seem to have owl:sameAs to all languages. Prefixing dbpedia.org with language codes: de, el and ru files contain URIs like de.dbpedia.org, el.dbpedia.org, ru.dbpedia.org. fr, es, nl, (all others i tried) lack the prefixes in all data files i tried, but the interlanguage_links_en.n3 show them. This leads to all interlanguage links pointing to Example: from fr/lables_fr.nt.bz2: \"Alg\u00E8bre g\u00E9n\u00E9rale\"@fr . from interlang-i18n/fr/interlanguage_links_fr.nt.bz2: . . I'd happily offer a hand to help fixing these issues in 3.8. Cheers, Jörn PS: yes, i just wanted to update uHi Jörn, most of these problems should be fixed in the current development version of the code. 1. Text encoding Starting with 3.8, we will be able to generate all kinds of different file formats, mainly N-Triples, N-Quads and Turtle, but also TriX. The N-Triples and N-Quads files will be ASCII only, as required by the spec. Of course, that means they are hardly human-readable for non-Latin languages. The Turtle files use no special Turtle features like @prefix, they are basically N-Triples files that use UTF-8 encoding. They are human human-readable for all languages, because only some special ASCII characters have to be escaped in Turtle. We also produce turtle-quads files. I don't think there is a formal specification for that format - it's basically N-Quads, but with Turtle rules for \u escaping, i.e. using UTF-8 instead of most \u escapes. If you still find any encoding problems, please let us know. The Turtle rules may not be implemented correctly for higher Unicode planes. 2. URI escaping Also starting with 3.8, there will be a very simple configuration switch to chose between IRIs (only a few special ASCII chars are percent-escaped) and URIs (special ASCII chars and all non-ASCII chars are percent-escaped). Both can be written to different files during one extraction run. We will probably have to chose 'canonical' DBpedia URIs/IRIs though, so we may not publish both versions. In both cases, round brackets \"()\" will not be escaped because the RFCs for URIs and IRIs do not mandate it. I hope this won't cause too many backwards-compatibility problems. If it does, it's trivial to change this behavior since I we now have our own configurable URI encoder. Again, the problems should be fixed, but especially for higher Unicode planes, IRI escaping may not be quite correct. We're not sure yet which of these files we will publish. 
There are twelve different format combinations (quads/triples, URIs/IRIs, NT/Turtle, etc), and it doesn't make sense to publish them all. In the past, we only published NT and NQ files, but I would like to also offer Turtle. 3. Inter language links and 4. DBpedia language domains are fodder for a separate mail So much for now, Christopher On Mon, May 7, 2012 at 12:19 PM, Jörn Hees < > wrote:" "DBPedia RDF Data - Only English" "uHi,   You mean all the files from: right?   If I want to store whole English DBPedia in RDF store in NT format, I should store all following 40 nt files, right?    article_categories_en.nt.bz2 29-Jun-2012 02:48  144M    category_labels_en.nt.bz2 29-Jun-2012 09:17  9.8M    contents-nt.txt 01-Aug-2012 11:26  2.5K    disambiguations_en.nt.bz2 25-Jul-2012 11:03  9.7M    disambiguations_unredirected_en.nt.bz2 29-Jun-2012 04:08  9.6M    external_links_en.nt.bz2 29-Jun-2012 04:27  130M    geo_coordinates_en.nt.bz2 29-Jun-2012 13:53  13M    homepages_en.nt.bz2 29-Jun-2012 13:55  8.0M    images_en.nt.bz2 29-Jun-2012 00:19  77M     infobox_properties_en.nt.bz2 25-Jul-2012 11:13  479M    infobox_properties_unredirected_en.nt.bz2 29-Jun-2012 08:32  479M    infobox_property_definitions_en.nt.bz2 29-Jun-2012 06:54  644K    infobox_test_en.nt.bz2 29-Jun-2012 09:11  199M    instance_types_en.nt.bz2 29-Jun-2012 13:09  59M    interlanguage_links_en.nt.bz2 25-Jul-2012 15:17  132M    interlanguage_links_same_as_chapters_en.nt.bz2 25-Jul-2012 11:14  61M    interlanguage_links_same_as_en.nt.bz2 25-Jul-2012 15:20  119M    interlanguage_links_see_also_chapters_en.nt.bz2 25-Jul-2012 11:14  2.3M    interlanguage_links_see_also_en.nt.bz2 25-Jul-2012 15:22  4.4M    iri_same_as_uri_en.nt.bz2 25-Jul-2012 17:42  10M    labels_en.nt.bz2 25-Jul-2012 11:16  134M    long_abstracts_en.nt.bz2 25-Jul-2012 11:18  644M    mappingbased_properties_en.nt.bz2 25-Jul-2012 11:25  175M    mappingbased_properties_unredirected_en.nt.bz2 29-Jun-2012 13:52  176M    page_ids_en.nt.bz2 27-Jul-2012 23:00  122M    page_links_en.nt.bz2 25-Jul-2012 16:58  1.1G    page_links_unredirected_en.nt.bz2 29-Jun-2012 10:57  1.1G    persondata_en.nt.bz2 25-Jul-2012 11:25  42M    persondata_unredirected_en.nt.bz2 29-Jun-2012 13:26  42M    pnd_en.nt.bz2 29-Jun-2012 09:15  23K     redirects_en.nt.bz2 12-Jul-2012 12:09  75M    redirects_transitive_en.nt.bz2 12-Jul-2012 11:00  92M    revision_ids_en.nt.bz2 27-Jul-2012 23:09  137M    revision_uris_en.nt.bz2 27-Jul-2012 23:18  184M    short_abstracts_en.nt.bz2 25-Jul-2012 11:27  344M    skos_categories_en.nt.bz2 29-Jun-2012 00:21  25M    specific_mappingbased_properties_en.nt.bz2 29-Jun-2012 08:06  6.0M    topical_concepts_en.nt.bz2 25-Jul-2012 18:32  913K    topical_concepts_unredirected_en.nt.bz2 09-Jul-2012 18:37  900K    wikipedia_links_en.nt.bz2 29-Jun-2012 09:32  200M    Thanks.     From: Paul Wilton < > To: Vishal Sinha < > Cc: \" \" < > Sent: Thursday, February 7, 2013 6:38 PM Subject: Re: DBPedia RDF Data - Only English On Thu, Feb 7, 2013 at 12:02 PM, Vishal Sinha < > wrote: Hi, 184M short_abstracts_en.nt.bz2 25-Jul-2012 11:27 344M skos_categories_en.nt.bz2 29-Jun-2012 00:21 25M specific_mappingbased_properties_en.nt.bz2 29-Jun-2012 08:06 6.0M topical_concepts_en.nt.bz2 25-Jul-2012 18:32 913K topical_concepts_unredirected_en.nt.bz2 09-Jul-2012 18:37 900K wikipedia_links_en.nt.bz2 29-Jun-2012 09:32 200M Thanks. From: Paul Wilton < > To: Vishal Sinha < > Cc: \" \" Sent: Thursday, February 7, 2013 6:38 PM Subject: Re: DBPedia RDF Data - Only English Thanks. 
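One way to check what actually ended up in a store after loading those dump files is to count triples per named graph. A rough sketch; a query like this can be slow or time out on very large stores, and on a local setup the graph names depend entirely on how the loader was configured:

SELECT ?g (COUNT(*) AS ?triples) WHERE {
  GRAPH ?g { ?s ?p ?o }
}
GROUP BY ?g
ORDER BY DESC(?triples)

On the public endpoint the bulk of the English data sits in the http://dbpedia.org graph, so seeing one dominant graph rather than forty separate ones is expected.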
uHi Vishal, if what is present in the public DBpedia endpoint is enough, have a look at which datasets are loaded there: Cheers, Anja Am 07.02.2013 um 15:43 schrieb Vishal Sinha:" "XML header information in SPARQL results" "uHi everyone, I've been trying to use the SPARQL endpoint at with! Thanks so much for this work. I have a small question regarding the formatting of the SPARQL query results. According to the SPARQL Query Results XML Format specification, section 2.1 [1], the results document should be a valid XML documents as follows: xml version='1.0' encoding='%SOUP-ENCODING%' However, the results I get from the endpoint omit the first line. Any reason why? It's a minor issue, but it could break code that accesses several different SPARQL endpoints, many of which do include that line. It just did mine. :) Thanks, Anupriya [1] uAnupriya, On 4 Jan 2008, at 23:47, Anupriya Ankolekar wrote: Thanks for the kind words. This first line (called the \"XML declaration\") is optional. As you know, SPARQL results are a kind of XML document, and the XML spec doesn't require this line [1]. Do you use an XML parser to read the results? I would be surprised if any XML parser choked on a missing XML declaration. Best, Richard [1] uHi Richard, I didn't realise the XML declaration is optional (my bad), so thanks for pointing it out! I was using a home-grown XML parser. However, since it is recommended by the specification (a SHOULD but not a MUST), it might still be good to include the line. Openlink has already implemented the change in their Virtuoso server. (Fast work!) Best, Anupriya On Jan 6, 2008 11:08 AM, Richard Cyganiak < > wrote:" "MapReduce expert needed to help DBpedia [as GSoC co-mentor]" "uDear all, We want to adapt the DBpedia extraction framewok to work with a MapReduce framework. [1] We want to implement this idea through GSoC 14 and already got two interested students [2] [3]. Unfortunately we are not experienced in this field and our existing contacts could not join. Thus, we are looking for someone to help us mentor the technical aspects of this project. About GSoC ( The *Google Summer of Code* (*GSoC*) is an annual program, first held from May to August 2005,[1] in which Google awards stipends (of US$5,500, as of 2014) to all students who successfully complete a requested free and open-source software coding project during the summer. See some additional info on our page [4] Best, Dimitris [1] [2] student #1 [3] student #2 [4] gsoc2014?v=kx0#h358-6" "Academic Vacancies - CS Dept - University of Cyprus" "uANNOUNCEMENT OF ACADEMIC POSITION The University of Cyprus invites applications for one (1) tenure-track academic position at the rank of Lecturer or Assistant Professor. DEPARTMENT OF COMPUTER SCIENCE One position at the rank of Lecturer or Assistant Professor in the field of: \"Computer Security\"with emphasis in: - Operating System Security or - Internet Security or - Web Security or - Network Security For all academic ranks, an earned Doctorate from a recognized University is required. Requirements for appointment depend on academic rank and include: prior academic experience, research record and scientific contributions, involvement in teaching and in the development of high quality undergraduate and graduate curricula. The minimum requirements for each academic rank can be found at the webpage: The official languages of the University are Greek and Turkish. For the above position knowledge of Greek is necessary. 
Holding the citizenship of the Republic of Cyprus is not a requirement. In the case that the selected candidate does not have sufficient knowledge of the Greek language, it is the selected candidate's and the Department's responsibility to ensure that the selected academic acquires sufficient knowledge of the Greek language within 3 years, since the official language of teaching at the University is the Greek language. It is noted that each Department sets its own criteria for the required level of adequacy of knowledge of the Greek language. The annual gross salary (including the 13th salary) is: Lecturer (Scale A12-A13) 43,851.21 Euro - 71,358.82 Euro Assistant Professor (Scale A13-A14) 57,694.26 Euro - 77,811.11 Euro Applications must be submitted by Thursday, 12th of March 2015. The application dossiers must include two (2) sets of the following documents in printed and electronic form (i.e., two (2) hardcopies and two (2) CDs with the documents in PDF (Portable Document Format) or Word files). I. Cover letter stating the Department, the field of study, the academic rank(s) for which the candidate applies, and the date on which the candidate could assume duties if selected. II. Curriculum Vitae. III. Brief summary of previous research work and a statement of plans for future research (up to 1500 words). IV. List of publications. V. Copies of the three most representative publications. VI. Copies of Degree certificates should be scanned and included in the CDs. VII. Names and contact details for three academic referees. Applicants must ask three academic referees to send recommendation letters directly to the University. The names and contact details of these referees must be indicated in the application, because additional confidential information may be requested. The recommendation letters must reach the University by Thursday, 12th of March 2015. The Curriculum Vitae and the statements of previous work and future research plans should be written in Greek or in Turkish, and in one international language, preferably English. Selected candidates will be required to submit copies of degree certificates officially certified by the Ministry of Education (for certificates received from Universities in Cyprus) or from the Issuing Authority (for foreign Universities). Applications, supporting documents and reference letters submitted in response to previous calls in the past will not be considered and must be resubmitted. Incomplete applications, not conforming to the specifications of the call, will not be considered. All application material must be submitted, in person, to: Human Resources Service University of Cyprus University Campus Council/Senate Anastasios G Leventis Building P.O. Box 20537 1678 Nicosia, Cyprus Tel. 22894158/4155 by Thursday, 12th of March 2015, 2:00 p.m. Alternatively, applications can be sent by post; they will be accepted as valid as long as the sealed envelopes are post-marked before the deadline of March 12, 2015, and they reach the Human Resources Service by March 19, 2015, on the sole responsibility of the applicant. For more information, candidates may contact the Human Resources Service (tel.: 00357 22894158/4155) or the Department of Computer Science ( , 00357 22892700/2669)." "Wikipedia dump corresponding to DBpedia 3.7" "uI need the wikipedia dump from 2011-07-22 from which DBpedia 3.7 was extracted. It is no longer available from the official Wikipedia dumps page. Can you please point me to a non-torrent version of it. 
Thanks, Mohamed I need the wikipedia dump from 2011-07-22 from which DBpedia 3.7 was extracted. It is no longer available from the official Wikipedia dumps page. Can you please point me to a non-torrent version of it. Thanks, Mohamed" "Invalid Turtle syntax in Freebase dumps" "uAll, There is a syntax problem with the Freebase dump. I don't know if the dump was prepared by the DBpedia team or by someone at MetaWeb, can anyone tell me who made it? Anyway, here is the problem: The Freebase dump at this: @prefix owl: . @prefix fb: . @prefix dbpedia: . dbpedia:%21%21%21 owl:sameAs fb:guid.9202a8c04000641f80000000002d1e19 . dbpedia:%21%21%21Fuck_You%21%21%21 owl:sameAs fb:guid.9202a8c04000641f8000000000bb290d . That's invalid Turtle, QNames are not allowed to include %-encoded characters. This means Sindice staff has to manually exclude this dump when indexing DBpedia. The fix would be simply to use full URIs: instead of dbpedia:%21%21%21 . I logged a bug here: Kudos to Nickolai for finding and verifying this bug. Best, Richard uRichard Cyganiak wrote:" "Different results depending on values of the LIMIT clause" "uHi, This is probably a problem with Virtuoso on the SPARQL access point ( depending on the values of the LIMIT clause. The LIMIT clause should have an effect on the number of results of a query. However, I think it should not filter out results, as it seams to happen in the following example. I want to retrieve the genres of all musical artists. The following query PREFIX rdf: PREFIX dbpedia: SELECT ?Artist ?Genre WHERE { ?Artist rdf:type dbpedia:MusicalArtist; dbpedia:genre ?Genre . } returns too many results hence I used OFFSET and LIMIT to split them into manageable subsets. To have predictable sets of results I used ORDER BY on Artist. In the following query I was expecting to see the genres of Billie Holiday ( the listing, but they are missing. PREFIX rdf: PREFIX dbpedia: SELECT ?Artist ?Genre WHERE { ?Artist rdf:type dbpedia:MusicalArtist; dbpedia:genre ?Genre . } ORDER BY ?Artist OFFSET 4425 LIMIT 1000 If I increase the limit it should return more solutions per query but overall it should return the same solutions. With a limit of 5000 the genres of Billie Holiday show up but at a higher offset that I would expect PREFIX rdf: PREFIX dbpedia: SELECT ?Artist ?Genre WHERE { ?Artist rdf:type dbpedia:MusicalArtist; dbpedia:genre ?Genre . } ORDER BY ?Artist OFFSET 5725 LIMIT 2000 Somehow the size of LIMIT seams to be influencing the outcome of these queries. If it is just a problem with my SPARQL queries I would appreciate if some could point it out to me. Thanks in advance for you help" "DBpedia URL Shortener and persistence" "u(at the bottom) has two shortened URLs that are only 2 years old, and are now broken. Kingsley, how long are shortened URLs supposed to persist? uOn 3/23/17 6:20 AM, Vladimir Alexiev wrote: Vladimir, The URL shortener wasn't supposed to be part of DBpedia. That happened by accident. They are broken because the DBpedia database was replaced when the most recent static edition was released. This functionality is volatile and not recommended beyond the span of a static DBpedia edition, especially while no commitment has been made re., long term maintenance. A long-term shortening service requires a dedicated instance. Kingsley" "what areas of knowledge has the best qualityand coverage in dbpedia?" "uI'd say that things are represented well in Dbpedia when the things are objects that have well defined properties. 
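On the "Different results depending on values of the LIMIT clause" thread above: OFFSET/LIMIT paging is only well defined when ORDER BY yields a total order; when many solutions share the same ?Artist key, an engine is free to break the ties differently for different LIMIT values. A sketch of a more deterministic version of that query (whether it also sidesteps the server-side behaviour reported there is not guaranteed):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?Artist ?Genre WHERE {
  ?Artist rdf:type dbo:MusicalArtist ;
          dbo:genre ?Genre .
}
# tie-break on ?Genre so equal ?Artist values always come out in the same order
ORDER BY ?Artist ?Genre
OFFSET 4425
LIMIT 1000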
For instance, if I show up at the courthouse with a birth certificate that documents my date and place of birth and my parents, that proves that I'm a particular Person. Someday I'll have a death certificate with a date, place and cause of death and in between there may be records about jobs I held, things I wrote, performances I was in and so forth. People who make their mark with creative activities, for instance, I think DBpedia well represents the creative output of somebody like On the other hand if somebody is a cop/bureaucrat it is harder to document their career. For instance, it wouldn't be so clear that his influence was orders of magnitude greater than any of these people I suppose you could add properties for \"?s arrested ?o .\" and \"?s did surveillance on ?o .\" and \"?s intimidated ?o .\" but that is not there now. You can also do a good job with chemicals, automobiles, airplane models and things like that. On the other hand, consider topics like Even though these are all activities that people do, they don't have common properties in infoboxes. I guess they could, because people spend a certain amount of money on these things a year, a certain number of people are interested, all three of them are things people can do as a hobby but you can't (by law) make money doing amateur radio (unless you're a teacher helping a class talk to astronauts on the ISS or that you can occasionally sell used gear on a swap net.) uOk, this far I understand. The topics that have infoboxes will be represented better in DBpedia, that's clear. I have a different question: which topics with infoboxes are represented better than other topics with infoboxes? Is there a rating or leaderboard of the topics? uHi Yury, this page [1] will show you which is the current status of the en template mappings. There are stats about the percentage of mapped properties. Hope this helps. Cheers Andrea [1] 2013/10/1 Yury Katkov < > uCool, thanks!" "DBpedia Class Hierarchy" "uHi Tom, btw, we have agree on a naming schema for the DBpedia classes. It's important to have URIs that deal with ambiguity. The Yago approach is to use only one meaning (Wordnet's most frequent), but if we try to build a class hierarchy we need a URI for each meaning. IMO there are to different approaches: 1. use Wordnet IDs 2. use Wikipedia URIs So to diffentiate the meaning of \"bank\" for example, using Wordnet: - \"bank_108420278\" for the financial institution - \"bank_109213565\" for sloping land (especially the slope beside a body of water) using Wikipedia (see - \"Bank\" for the financial institution - \"Bank_(XYZ)\" for other meanings The Wikipedia approach is much more intuitive, but Wordnet is already represented in RDFS, which will be a hard task to do with Wikipedia names. Cheers, Georgi uHi All, We need to retain the benefits of WordNet synsets, focus on data conversions to RDF, and use what already exists. The advantage of the SKOS prefLable and altLabel is that, in the end, in really doesn't matter. Your \"glad\" may be my \"happy\" so why fight about it? We just need to adopt a standard reference and use it. So, we use WordNet for prefLabel, have a mapping mechanism via altLabel and WordNet synsets, let's move on. We have to adopt a canonical reference somewhere, and unfortunately, Wikipedia has no such synsets. Using WordNet synsets, we can still have as much richness as anyone could desire, even if a particular term may not be someone's favorite. 
I suspect the fastest way to stall out the Linked Data initiative is to get sucked into the endless vortex of terminology and epistemology arguments. Mike Georgi Kobilarov wrote: uGeorgi Kobilarov wrote: For books, there are ISBN which are far from perfect(*), but still a lot more precise than just the title of a book. Wikipedia has articles for some books, and the book title is used as the URL and article title. But the ISBN is found inside the article and can be dug out from the public database dumps. Germany's national library assigns a unique ID number to each author, known as the PND (personennamendatei = authority file). The German Wikipedia has articles about many authors and their names are the URL and article title. But the PNDs are found inside the article text, So, identifiers can be tied to Wikipedia articles even if they aren't used as article titles. In a similar fashion, it is technically possible to embed the Wordnet identifiers inside the text of Wikipedia (and Wiktionary) articles. Whether that is a good idea that will get accepted by the community, depends on how useful an application you can design from this. (*) Old books don't have ISBNs. One book may have several ISBNs, one for each edition. The list of limitations goes on and on." "Regarding category pages in DBpedia - How to get total number of pages in a category" "uHi all, I wanted to ask if it is possible to get the exact number of articles that are present in a category in Wikipedia . For. eg : this shows that it has 7355 articles. The corresponding category page in dbpedia : articles in it. So, is it possible for me to get this number '7355' using dbpedia category uHi Somesh, On 06/15/2012 07:50 AM, Somesh Jain wrote: The following sparql query does what you want: SELECT count(?s) WHERE { ?s ?p ?o.filter(?p = && ?o = ) }" "Best local DBpedia live mirror‏ setup?" "uHi Emery, This was more a time to reply issue than a list issue:) I think Option 1 is the easiest for you. We had a few issues since February when we had to change the communication API with Wikipedia and a few bugs came up but we should be fine now. There is only a reported bug with the database that Openlink is looking atm but might not apply to all cases Thanks for your interest & let us know if you face any other issues Best, Dimitris On Tue, Oct 18, 2016 at 8:39 PM, Emery Mersich < > wrote:" "How do I run the tests?" "uSorry for the stupid question, but I'm trying to add support for Polish dates (untested patches attached), and can't find anything about running surefire with scala. Also; is it ok to add periods in the era regex? I was kind of expecting to see the German \"v. Chr.\" in there. uHi Jimmy, our test framework has changed and we are not using the XML files anymore. We are now using ScalaTest, and for this specific case, the DateTimeParserTest. It is easiest to run them from an IDE, for example Intellij. Thanks for the patches. We will check in the Homepage patch and the DateTime patch. For the era parsers, the current solution does not support clean customization of languages and should be re-organized before adding new language support. We will look into that in future releases. Best, Robert & Max 2010/6/25 Jimmy O'Regan < >: u[Sorry, forgot reply all first time] On 7 July 2010 12:45, Jimmy O'Regan < > wrote: Also, I created an account (User:Jimregan) on mappings.dbpedia.org - can I get permission to edit? (I swear I'm not a bot :) uOn Wed, Jul 7, 2010 at 2:07 PM, Jimmy O'Regan < > wrote: I take you word for it. 
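The count query quoted in the "Regarding category pages" thread above lost its URIs in transit; the pattern behind it uses dcterms:subject, which is how DBpedia ties an article to its Wikipedia categories. A sketch with an illustrative category standing in for the one from the original question (which is not preserved here):

PREFIX dct: <http://purl.org/dc/terms/>

SELECT (COUNT(DISTINCT ?article) AS ?articles) WHERE {
  ?article dct:subject <http://dbpedia.org/resource/Category:American_films> .
}

The resulting number will generally differ from the count shown on the live Wikipedia category page (such as the 7355 mentioned above), because DBpedia reflects the dump it was extracted from.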
You now have editor rights. max" "Odd foaf:name values" "uI have noticed that the files \"Mappingbased literals\" provided for download here contain rather odd values, and rather a lot of those. For example for English there are the following entries for Hitchcock: < where \"(KBE\" as a name looks odd to me. or < where both the name in parentheses and the \"(DBE)\" looks odd to me. Other literals contain several entries or additional descriptive text, e.g. \"Laqab: Diya ad-Din (shortly), Adud ad-Dawlah\"@en . \"Kunya: Abu Shuja\"@en . \"Given name: Muhammad\"@en . \"Turkic nickname: Alp Arslan\"@en . \"Nasab: Alp ArslanibnChaghri-Beg ibnMikailibnSeljuqibnDuqaq\"@en . Sometimes the language id for the literal appears to be incorrect, e.g. \"Athens\"@en . \"Αθήνα\"@en . In the German language file, some entries are still worse, e.g.: \"Akeleien\"@de . \"Hahnenfußgewächse\"@de . \"Hahnenfußartige\"@de . \"Eudikotyledonen\"@de . \"Gattung\"@de . \"Tribus\"@de . \"Unterfamilie\"@de . \"Familie\"@de . \"Ordnung\"@de . \"ohne\"@de . \"nein\"@de . \"L.\"@de . Clearly, only the first of these triples is correct. The template for this entry is not filled incorrectly or otherwise broken though: {{Taxobox | Taxon_Name = Akeleien | Taxon_WissName = Aquilegia | Taxon_Rang = Gattung | Taxon_Autor = [[Carl von Linné|L.]] | Taxon2_LinkName = nein | Taxon2_WissName = Isopyreae | Taxon2_Rang = Tribus | Taxon3_WissName = Isopyroideae | Taxon3_Rang = Unterfamilie | Taxon4_Name = Hahnenfußgewächse | Taxon4_WissName = Ranunculaceae | Taxon4_Rang = Familie | Taxon5_Name = Hahnenfußartige | Taxon5_WissName = Ranunculales | Taxon5_Rang = Ordnung | Taxon6_Name = Eudikotyledonen | Taxon6_Rang = ohne | Bild = Aquilegia ottonis amaliae2UME.jpg | Bildbeschreibung = [[Balkanische Akelei]] (''[[Aquilegia ottonis]]'' subsp. ''amaliae'') }} I do not know too much about how the triples get extracted from the original Wikimedia text or templates, but it seems that the extraction is too lenient instead of too strict: personally I would rather have no entries for those wrong ones. Although the semantics of foaf:name are pretty loose, I do not think that for organisms, all the higher level taxon names should be seen as the name of that organism, clearly \"Eukaryote\" is not a name of humans even though humans belong to this group. That the names of the taxon group is also included (e.g. the value for Taxon4_Rang) appears to be a bug. These are all examples only, but there are many more entries where the same errors appear in a systematic way. Is there any way to accomplish an extraction of the triples with more importance of precision than recall? I have noticed that the files 'Mappingbased literals' provided for download here   recall? uIn the case of the first two, (KBE) and (DBE) are abbreviations of chivalric titles which aren't that different from putting PhD or MD after the name, just I think it's more exclusive. You could make the case that \"Sir Alfred Hitchcock (KBE)\" is a valid name, but KBE itself is not. \"Lady Mallowan\" is a totally appropriate name for Agatha Christie because she was married to Max Mallowan who was himself a knight before she became a Dame. In general when you are harvesting names like this there is the problem of finding the valid variant forms and not finding invalid forms. There is also the issue that you should be more liberal about what you recognize than what you generate. 
For instance you can find some racist slurs in Wikipedia redirects which are names you probably should recognize if somebody tries to use them but that will probably cause you trouble if you try to use them. uHi Johann, I just had a closer look at the German example. In this case the foaf:name property is used very liberally as a mapping: {{PropertyMapping | templateProperty = Taxon_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon2_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon3_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon4_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon5_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon6_Name | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Bild | ontologyProperty = foaf:depiction }} {{PropertyMapping | templateProperty = Bildbeschreibung | ontologyProperty = depictionDescription }} {{PropertyMapping | templateProperty = Taxon_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon2_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon3_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon4_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon5_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon6_WissName | ontologyProperty = scientificName }} {{PropertyMapping | templateProperty = Taxon_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon2_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon3_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon4_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon5_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon6_Rang | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon2_LinkName | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon3_LinkName | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon4_LinkName | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon5_LinkName | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon6_LinkName | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon_Autor | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon2_Autor | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon3_Autor | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon4_Autor | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Taxon5_Autor | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Modus | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = ErdzeitalterVon | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Fundorte | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = MioVon | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = MioBis | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Subtaxa | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Subtaxa_Rang | ontologyProperty = foaf:name 
}} {{PropertyMapping | templateProperty = ErdzeitalterBis | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Subtaxa_Plural | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Rangunterdrückung | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = TausendBis | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = TausendVon | ontologyProperty = foaf:name }} {{PropertyMapping | templateProperty = Name | ontologyProperty = foaf:name }} This causes this behavior, obviously. Please feel free to change this mapping to a more precise depiction of Taxobox template in DBpedia. I can see that a lot of valid data is neglected (or passed with foaf:name) which could be useful for the community, when portrayed with fitting properties. Best, Markus Freudenberg Release Manager, DBpedia On Tue, Jan 17, 2017 at 7:58 PM, Paul Houle < > wrote:" "Dbpedia Ontology" "uHi all, Could anybody please tell me when a change in the dbpedia ontology at mappings.dbpedia.org is merged into dbpedia, respectively accessible via the sparql endpoints? Both live.dbpedia.org/sparql and dbpedia.org/sparql do not show my changes made over a week ago? Cheers, Daniel Hi all, Could anybody please tell me when a change in the dbpedia ontology at mappings.dbpedia.org is merged into dbpedia, respectively accessible via the sparql endpoints? Both live.dbpedia.org/sparql and dbpedia.org/sparql do not show my changes made over a week ago? Cheers, Daniel uOn 5/28/11 10:55 AM, Gerber Daniel wrote: To save time, just respond with a linkset dump URL. Kingsley uActually, this should be the query for it: PREFIX meta: CONSTRUCT {?s ?p ?o} FROM WHERE { ?b meta:origin meta:TBoxExtractor . ?b owl:annotatedSource ?s . ?b owl:annotatedProperty ?p . ?b owl:annotatedTarget ?o . FILTER(!(?p IN ( meta:editlink, meta:revisionlink, meta:oaiidentifier, ))). } And this for one class: PREFIX meta: CONSTRUCT {?s ?p ?o} FROM WHERE { ?b meta:sourcepage . ?b owl:annotatedSource ?s . ?b owl:annotatedProperty ?p . ?b owl:annotatedTarget ?o . FILTER(!(?p IN ( meta:editlink, meta:revisionlink, meta:oaiidentifier, meta:sourcepage, ))). } Normally this endpoint would work: But things are changing a lot lately. I think there must be a way to get it directly from the Wiki alsoSebastian On 28.05.2011 16:55, Gerber Daniel wrote: uThanks! But the reason for changing the ontology was to execute this query and get more results, especially for p2 since domain and range are bound. \"PREFIX rdf: \" + \"PREFIX rdfs: \" + \"SELECT ?s2 ?s2l ?p2 ?o2 ?o2l ?rangep2 ?domainp2 \" + \"WHERE {\" + \" ?s2 rdf:type .\" + \" ?s2 rdfs:label ?s2l . \" + \" ?o2 ?p2 ?s2 . \" + \" ?o2 rdfs:label ?o2l \" + \" FILTER (lang(?s2l) = \\"en\\") \" + \" FILTER (lang(?o2l) = \\"en\\") \" + \" ?p2 rdfs:range ?rangep2 . \" + \" ?p2 rdfs:domain ?domainp2 . \" + \"}\"; I've tried this endpoint ( Any ideas on how to make that query? Cheers, Daniel On 29.05.2011, at 08:47, Sebastian Hellmann wrote:" "How to extract Slovenian Category labels and skos?" "uHello! I have local instalation of DBpedia extraction and I would like to extract Slovenian category labels, SKOS and article categories. For now I have commented line 14 in CategoryLabelExtractor.scala 14 // require(Set(\"en\").contains(language)) And i get labels but the problem is there are duplications: < The reason seems to be that both \"letala\" and \"zrakoplovi\" link to the same english category. What is the right way to make extraction of categories possible. 
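The query in the "Dbpedia Ontology" thread above arrives as concatenated Java string fragments with the class URI dropped; laid out as plain SPARQL, and with dbo:MusicalArtist standing in for the missing class (purely illustrative), it amounts to:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?s2 ?s2l ?p2 ?o2 ?o2l ?rangep2 ?domainp2 WHERE {
  ?s2 rdf:type dbo:MusicalArtist ;   # the original class URI did not survive quoting
      rdfs:label ?s2l .
  ?o2 ?p2 ?s2 ;
      rdfs:label ?o2l .
  ?p2 rdfs:range  ?rangep2 ;
      rdfs:domain ?domainp2 .
  FILTER(lang(?s2l) = "en")
  FILTER(lang(?o2l) = "en")
}

Whether it returns anything for a given property hinges on the endpoint having picked up the domain and range statements added on the mappings wiki, which is exactly the synchronisation question that thread raises.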
Is replacing line 14 with: \"require(Set(\"en\", \"sl\").contains(language))\" enough? Is it possible to extract categories with categories which exist only in Slovenian wikipedia. I know that official DBpedia extracts only articles that have English page, but can I change something in extraction framework to base extraction on Slovenian titles or is this a bigger and deeper change? Regards, Marko Burjek Hello! I have local instalation of DBpedia extraction and I would like to extract Slovenian category labels, SKOS and article categories. For now I have commented line 14 in  CategoryLabelExtractor.scala 14 // require(Set('en').contains(language)) And i get labels but the problem is there are duplications: < Burjek uHi, you can checkout the \"Greece\" branch from the framework, it deals with internationalizations issues such as yours. We implemented it for the creation of the Greek DBpedia but is configurable for other languages as well It is not documented yet but you can find some configurations options in the following files: org.dbpedia.extraction.ontology.OntologyNamespaces.scala val specificLanguageDomain = Set(\"el\", \"de\", \"it\") val encodeAsIRI = Set(\"el\", \"de\") org.dbpedia.extraction.mappings.extractor.scala private def retrieveTitle(page : PageNode) : Option[WikiTitle] = On Wed, Mar 30, 2011 at 12:41 AM, Marko Burjek < >wrote: uHi Marko, maybe you could sign up on the dbpedia developers list: and give us feedback, whether you managed to create a slovenian dump. If you have some additions/fixes/extensions/tests to the code base, please tell me. I could give you write access to the Mercurial to commit you code. Sebastian On 30.03.2011 10:11, Dimitris Kontokostas wrote:" "Finding information on dbpedia" "uIf I have to fine some information about Physics course like Scalar products, vector products, what is momentum, torque etc, how can I get this information from dbpedia. Most of these information are available on wikipedia but I could not find it on dbpedia. If I have to fine some information about Physics course like Scalar products, vector products, what is momentum, torque etc, how can I get this information from dbpedia. Most of these information are available on wikipedia but I could not find it on dbpedia." "Problems accessing to ontology classes' properties or testing the ontology mappings" "uHi all ; We are trying to complete infobox mappings for Turkish language. But since friday when we try to open properties of an Ontology Class or try to test a mapping which is done before , we are getting an error message that \" Service Temporarily UnavailableThe server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later. \" Is there any maintenance at dbpedia now that is why I am getting this error message or there is another problem about that. Thanks ; Sinan KÖRDEMÝR Hi all ; We are trying to complete infobox mappings for Turkish language. But since friday when we try to open properties of an Ontology Class or try to test a mapping which is done before , we are getting an error message that ' Service Temporarily Unavailable The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later. ' Is there any maintenance at dbpedia now that is why I am getting this error message or there is another problem about that. 
Thanks ; Sinan KÖRDEMİR" "Question about missing dependencies in DBpedia Extraction Framework" "uDear All, While I'm trying to install the DEF on my machine (Linux Ubuntu) using IntelliJ IDEA, I have cloned the DEF from this link: git clone git://github.com/dbpedia/extraction-framework.git However, there are some missing dependencies in the Maven project for the \"Live Extraction\" module when I run clean and install as recommended from this link: PS: I have used git clone git://github.com/dbpedia/extraction-framework.git instead of the Mercurial repository URL: I got the following error: INFO] Building Live extraction [INFO] task-segment: [install] [INFO] uThe aksw repository seems to have temporary problems. You can comment out the live module, you probably don't need it. While you're at it, you can also comment out server, scripts and wiktionary to speed up the build. In the parent pom.xml: core" "sameAs links to Freebase" "uHi everyone, In order to improve the Italian DBpedia interlinkage, it would be great to add sameAs links to Freebase. How did you guys do that for the English version? I found a couple of posts in the Freebase wiki [1] and the DBpedia blog [2] saying it was done some years ago, but I can't find more details. Could you please tell me more? Thanks! Cheers, Marco uForgot the links [1] [2] On 10/17/12 1:30 PM, Marco Fossati wrote: uHi Marco, In DBpedia Greek we created a script [1] with which we created these links transitively from the English datasets. Note that it needs a few modifications to work with the new directory structure. If you find it handy and adapt it, you can contribute it back for the other chapters. Best, Dimitris [1] On Wed, Oct 17, 2012 at 2:37 PM, Marco Fossati < > wrote: uHi Marco, Can also be obtained via Wikipedia links in the Freebase Quad Dump: /m/0p_47 /type/object/key /wikipedia/en Steve_Martin /m/0p_47 /type/object/key /wikipedia/de_id 107261 Would be sweet to have a Scala object that blazes through this dump and outputs the sameAs info. Cheers, Pablo On Wed, Oct 17, 2012 at 2:04 PM, Dimitris Kontokostas < > wrote: uHi Dimitris, Thanks for pointing me to your script. It will be useful for creating the links from the English data. However, I am wondering how to generate Freebase links for those language-specific resources that do not have a counterpart in the English chapter. See for instance the Italian Wikipedia article for Maurizio Crozza [1]. There is no English counterpart, hence no data in the English DBpedia, only in the Italian one [3]. In Freebase, there is indeed some [4]. So, the core of my question is: how to generate Freebase links from scratch? Since there is no real RDF Freebase dump, I assume I cannot use a tool like Silk for the linking task. Cheers, Marco [1] [2] [3] [4] On 10/17/12 2:04 PM, Dimitris Kontokostas wrote: uHi Pablo, On 10/17/12 3:20 PM, Pablo N. Mendes wrote: Yep, this solution can perfectly cover all the Freebase resources that have a '/wikipedia/{LANGUAGE}' or '/wikipedia/{LANGUAGE}_id' key. Do you know if it is possible to generate links to resources that have no such keys? For instance, Maurizio Crozza has only keys from MusicBrainz [1]. Does the English chapter consider those cases?
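(Along the lines of the Scala object Pablo wishes for above, here is a rough, untested sketch that scans a tab-separated quad dump for /wikipedia/{lang} keys and prints owl:sameAs triples. The file layout, key encoding and URI patterns are assumptions based only on the excerpts quoted in this thread, not on the real dump format:)

import scala.io.Source

// Hypothetical sketch: emit DBpedia-to-Freebase sameAs links for one language
// by scanning /type/object/key quads whose value namespace is /wikipedia/{lang}.
object FreebaseSameAsSketch {
  def main(args: Array[String]): Unit = {
    val dumpFile = args(0)                                // path to the quad dump (assumed TSV)
    val lang     = if (args.length > 1) args(1) else "en"
    val keyNs    = s"/wikipedia/$lang"
    for (line <- Source.fromFile(dumpFile, "UTF-8").getLines()) {
      val cols = line.split("\t")
      if (cols.length >= 4 && cols(1) == "/type/object/key" && cols(2) == keyNs) {
        val mid   = cols(0)                               // e.g. /m/0p_47
        val title = cols(3)                               // e.g. Steve_Martin ($-escapes not decoded here)
        println(s"<http://dbpedia.org/resource/$title> " +
          s"<http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns$mid> .")
      }
    }
  }
}

A real pass would still have to decode the $-escaped characters visible in the excerpts above, resolve DBpedia redirects, and pick the canonical Freebase URI form.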
Cheers, Marco [1] uThe Freebase dump includes direct links for topics which have interwiki links: $ bzgrep /wikipedia/it freebase-datadump-quadruples.tsv.bz2 | head /m/010pld /type/object/key /wikipedia/it Yorktown_$0028Virginia$0029 /m/010pld /type/object/key /wikipedia/it_id 1071928 /m/010pld /type/object/key /wikipedia/it_title Yorktown_$0028Virginia$0029 /m/010vmb /type/object/key /wikipedia/it Walla_Walla /m/010vmb /type/object/key /wikipedia/it_id 2353022 /m/010vmb /type/object/key /wikipedia/it_title Walla_Walla /m/011f6 /type/object/key /wikipedia/it Aerodinamica /m/011f6 /type/object/key /wikipedia/it_id 47589 /m/011f6 /type/object/key /wikipedia/it_title Aerodinamica /m/011m0q /type/object/key /wikipedia/it_id 1633330 On Wed, Oct 17, 2012 at 9:30 AM, Marco Fossati < > wrote: Practically the only way to make the link for Maurizio Crozza is manually, because there's no additional information to match on. If Wikipedia had a link to MusicBrainz or Freebase had one of his movies and thus a link to IMDB, the task might be possible to automate, but with the information that's available it's basically an impossible task. One might imagine a super sophisticated matching tool which was able Gialappa's Band references from the WP article and the MusicBrainz album title, but I think that's a real stretch. It's not the data format that's the problem. You can't use an automated guessing tool like Silk unless it's got at least a little bit of information to base its guesses on. For topics which *do* have other strong identifiers (IMDB, MusicBrainz, Library of Congress, VIAF, etc) or additional information (e.g. birth date) it would be possible to do some automatic matching, but I don't how many of your articles fall into that category. Tom uHi Tom, On Wed, Oct 17, 2012 at 5:10 PM, Tom Morris < > wrote: Do you copy the interwiki links as-is or do you post process them? We calculated that more than 90% of interlinking errors come from 1-way links and thus, we use only the 2-way links for owl:sameAs [1] Best, Dimitris [1] uOn Wed, Oct 17, 2012 at 10:23 AM, Dimitris Kontokostas < >wrote: I don't do anything. You'd have to ask the Freebase team what links they use (although it should be pretty easy to figure out by comparing the Freebase dump with the interwiki links). $40? I'm sure it's a wonderful paper, but I think I'll wait for the movie version. Tom uOn Wed, Oct 17, 2012 at 5:29 PM, Tom Morris < > wrote: I thought you were in the team, my mistake This is the free link ;) uHi Tom, I checked the whole Freebase dump looking for topics that have an Italian Wikipedia id AND NOT an English one. It seems there is no one. Since the English links are already created and published, I conclude there is no need to generate new links from scratch (at least for the Italian chapter). WRT Italian Wikipedia articles that have no counterparts in the English version, I will see whether it is worth to automate the task or not. Thank you all for the advices! Cheers, Marco On 10/17/12 4:10 PM, Tom Morris wrote: uOn Wed, Oct 17, 2012 at 12:48 PM, Marco Fossati < >wrote: That sounds correct. As far as Wikipedia is concerned, Freebase only imports from English Wikipedia, so that forms the core. Tom On Wed, Oct 17, 2012 at 12:48 PM, Marco Fossati < > wrote: I checked the whole Freebase dump looking for topics that have an Italian Wikipedia id AND NOT an English one. It seems there is no one. That sounds correct.  As far as Wikipedia is concerned, Freebase only imports from English Wikipedia, so that forms the core. 
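(As a side note on the 1-way versus 2-way point Dimitris makes above, the filter is conceptually just an intersection of a link set with its own inverse; a small illustrative sketch, not the actual DBpedia post-processing code:)

// Illustrative sketch: keep only bidirectional (2-way) interlanguage links,
// i.e. pairs (a, b) whose reverse (b, a) is also present in the link set.
object TwoWayLinkFilter {
  def keepTwoWay(links: Set[(String, String)]): Set[(String, String)] =
    links.filter { case (a, b) => links.contains((b, a)) }

  def main(args: Array[String]): Unit = {
    val links = Set(
      ("en:Rome", "it:Roma"),
      ("it:Roma", "en:Rome"),   // 2-way pair: kept (both directions)
      ("en:Foo",  "it:Bar")     // 1-way only: dropped
    )
    keepTwoWay(links).foreach(println)
  }
}

Keeping only pairs whose reverse also exists is the kind of filter that removes the 1-way links blamed above for most interlinking errors, at the cost of dropping some genuine but unreciprocated links.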
Tom Since the English links are already created and published, I conclude there is no need to generate new links from scratch (at least for the Italian chapter). WRT Italian Wikipedia articles that have no counterparts in the English version, I will see whether it is worth automating the task or not. Thank you all for the advice! Cheers, Marco On 10/17/12 4:10 PM, Tom Morris wrote: uOn Wed, Oct 17, 2012 at 3:20 PM, Pablo N. Mendes < > wrote: Here you go: The script currently only looks for wikipedia/en links, but it's trivial to change that to another language (and very simple to add a parameter for the language). The script needs the DBpedia datasets Labels, Redirects and DisambiguationLinks. Cheers, JC uI'll just add that there are two ways to reconcile DBpedia and Freebase, and they don't ~quite~ give the same answers. (1) you can use the /type/object/key with /wikipedia/(en|de) and match that up with the DBpedia identifier, or (2) you can use the numeric \"en_id\", \"de_id\" and match that up with the number from DBpedia. Note you can generate a linkage file for (1) based entirely on the quad dump, but to do (2) you need the page IDs from DBpedia. I've tried both of these and decided to use (2) based on a very superficial analysis for Ookaboo. All I really know is that the results from (1) and (2) aren't quite the same." "no results from dbpedia when accessed through script whereas through browser the results are available." "uDear all, I am facing a problem and hope someone will help me in solving this.
In one of the prototypes I am involved in, I am accessing DBpedia using jQuery by forming an AJAX query that picks some keywords from an HTML page, builds the query and sends it to DBpedia. The resulting data in JSON form is parsed and presented in tables. This was working last year (approx. 15 months ago). Now the script is not able to bring back any data, whereas if the query URL is accessed through a web browser, a JSON file is available for download. What could the reasons for this be? Are there any modifications DBpedia has made for script-based access? uHi Vinit, One of my tools is also querying the DBpedia endpoint through AJAX, and I can confirm it's been working fine the past 15 months, or so, on my end. Have you checked the console log? Also, have you tried dumping the response, i.e. the JSON object (if any) to the console (e.g. console.log(data))? One thing I know for sure is that Virtuoso 7, for some reason, doesn't retrieve the results in the same order as Virtuoso 6, so that could be an issue, if DBpedia has migrated to V7 in the past year, and you're expecting a result in a certain format (so you can parse it). I don't know if that is the case, though. Best, Uroš From: Vinit Kumar [mailto: ] Sent: Tuesday, August 26, 2014 6:33 PM To: Subject: [Dbpedia-discussion] no results from dbpedia when accessed through script whereas through browser the results are available. uHi Vinit, it works for me, try this (using jQuery): $.ajax({url:' ',success:function(data){console.info(data)}}) and you will get a valid response with headers like these: 1. Accept-Ranges: bytes 2. Access-Control-Allow-Origin: * 3. Access-Control-Expose-Headers: Content-Type 4. Connection: keep-alive 5. Content-Length: 12395136 6. Content-Type: application/sparql-results+json 7. Date: Tue, 26 Aug 2014 17:35:28 GMT 8. Server: Virtuoso/07.10.3211 (Linux) x86_64-redhat-linux-gnu VDB 2014-08-26 18:33 GMT+02:00 Vinit Kumar < >:" "Instance types for languages where it's currently unavailable" "uHello, I'm working on a multilingual NLP tool that, among other things, uses DBpedia instance type data. However, I see that for certain smaller languages this data is not available. Can anyone elaborate on what it would take to extract and make this type of data available for a new language (e.g. my native Latvian)? Perhaps the required amount of work is feasible enough for me to do it myself.
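(By way of illustration of what the mapping-based typing boils down to once mappings exist, here is a toy sketch, not the extraction framework's real API: the infobox template used on a page is looked up in a template-to-class table and an rdf:type triple is emitted. The Latvian template names below are made up for the example:)

// Toy sketch of mapping-based instance typing (illustrative only).
object ToyInstanceTyper {
  // In the real setting, this table is what the mappings wiki supplies per language.
  val templateToClass: Map[String, String] = Map(
    "Infobox_pilsēta" -> "http://dbpedia.org/ontology/Settlement",  // hypothetical template name
    "Infobox_persona" -> "http://dbpedia.org/ontology/Person"       // hypothetical template name
  )

  def typeTriple(resource: String, template: String): Option[String] =
    templateToClass.get(template).map { cls =>
      s"<http://lv.dbpedia.org/resource/$resource> " +
        s"<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <$cls> ."
    }

  def main(args: Array[String]): Unit =
    println(typeTriple("Rīga", "Infobox_pilsēta").getOrElse("no mapping for this template"))
}

So most of the work for a new language is editorial: writing the template-to-class mappings on the mappings wiki, after which the extraction can produce the instance-type data.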
It is looking like DBpedia is suffering some issues this morning: \"CL: Cluster could not connect to host 2 22202 error 111\" Best, y uOn 1/16/13 8:32 AM, Yves Raimond wrote:" "DBPedia Lookup - We will fix the problem" "uHi Rohana, we are currently reorganizing our sever infrastructure and are also working on updating the index that is used by the lookup service. Thus, please expect the lookup service to be online again in about 2 weeks. Best, Chris Von: Rohana Rajapakse [mailto: ] Gesendet: Montag, 17. Januar 2011 11:02 An: Betreff: [Dbpedia-discussion] DBPedia Lookup Hi, The DBPediaLookUp (lookup.dbpedia.org) service is down for sometime now. Any idea why? And when will it be up and running again? Rohana Rajapakse Senior Software Developer GOSS Interactive t: +44 (0)844 880 3637 f: +44 (0)844 880 3638 e: w: www.gossinteractive.com" "DBpedia Quality Improvement: Please reportbugs!" "uHi Tom, Thanks for your message. Your example the Wikipedia-URI of Trust_(social_sciences) was Trust_(sociology) in February when we picked up the Wikipedia dump. We do need (but don't have yet) a mechanism to construct redirects in DBpedia for such changes, so links to DBpedia URIs will not end as dead links. We will most likely switch to unencoded URIs with our next release because Apache does funny things (well, not funny if you try to debug it, Richard) with encoded URIs. So You find it selecting \"virtue\" or \"issue\" from the cloud. Well, maybe some work to do on classification?! ;) The current version of DBpedia Search uses incoming links as one metric to compute the rank of results. The next version will use a PageRank variant, where the rank of linking articles is also included. These lists of articles (for example List of National Trust properties in England ) get high ranks this way, I don't know for sure how to handle it yet. Next version will also solve this annoying performance;) :) Cheers, Georgi uHi Tom, so, I discussed that with Richard and we will stay with encoded URIs, because Wikipedia use encoded links in it's HTML. For example, see links in HTML of Our reason to consider switching to unencoded URIs was that bug with the Apache Proxy, but that's fixed. Cheers Georgi uHi Georgi, Brilliant, and thanks for the quick response. Encoded URIs it is! :) Tom. On 27/06/07, Georgi Kobilarov < > wrote:" "Resources and their classes" "uHello everyone, i'm working on a project tat involves the use of the classes of dbpedia ontology and YAGO. Analizyng some resources there are examples that doesn't match to me, so I'm asking for a bit of explanation about them. Exmple 1: This example has both dbpedia-owl classes and ago classes referenced by a rdf:type property rdf:type dbpedia-owl:Place dbpedia-owl:Area dbpedia-owl:Resource dbpedia-owl:PopulatedPlace yago-class:AncientGreekCities yago-class:CitiesAndTownsInApulia yago-class:CoastalCitiesAndTownsInItaly yago-class:PortCitiesInItaly Example 2: This example has no YAGO class, no rdf:type, but has this property: dbpprop:type dbpedia:Pirate Why it has no rdf:type ? what is this propery (dbpprop:type) used for? Why his type is not a class in the dbpedia ontology? Example 3: This example has no YAGO class, a rdf:type along with a new dbpedia- owl:type: rdf:type dbpedia-owl:Company dbpedia-owl:Resource dbpedia-owl:Organisation dbpedia-owl:type dbpedia:Private_security dbpedia:Pirate Also, now dbpedia:Pirate is referenzed by a dbpedia-owl property while before it was referenced by a dbpprop property. 
Example 4: this has no rdf:type, no YAGO class, but both dbpedia-owl:type and dbpprop:type (both inverse) is dbpedia-owl:type of dbpedia:Anti_Piracy_Maritime_Security_Solutions is dbpprop:type of dbpedia:Blackbeard dbpedia:Edward_Low dbpedia:Anne_Bonny [] Example 5: Ths has no rdf:type, no YAGO class, the only \"ontology related\" properties in it are: is dbpedia-owl:genre of dbpedia:Shadow_Kiss dbpedia:Blood_Promise_%28novel%29 dbpedia:Frostbite_%28Richelle_Mead_novel%29 dbpedia:Vampire_Academy is dbpedia-owl:subject of dbpedia:The_Vampire_Lestat dbpedia:Interview_with_the_Vampire dbpedia:Blackwood_Farm Now can someone please explain why there are so different structures in those resources? One can't just pick a resource and get his class in the ontology? Please clear my mind :) Thank you in advice, Piero Hello everyone, i'm working on a project tat involves the use of the classes of dbpedia ontology and YAGO. Analizyng some resources there are examples that doesn't match to me, so I'm asking for a bit of explanation about them. Exmple 1: Piero uPiero, Short answer: Ignore dbprop:type and dbpedia-owl:type, use only rdf:type. Long answer: dbprop:type and dbpedia-owl:type are generated from Wikipedia template properties with similar names and are sometimes confusing. rdf:type properties are more precise - they are only generated if the Wikipedia page uses a template for which we have a mapping to our ontology. Concerning your examples: for which we have a mapping to our ontology, so we generate an rdf:type property with the value dbpedia-owl:PopulatedPlace and its base types. The wiki source code of contains the following template, which has a 'type' property: {{Infobox Pirate | type = [[Pirate]] }} We don't have a mapping from to our ontology (yet), so we generate no rdf:type and no dbpedia-owl:* properties, only generic dbprop:* properties, which are simply the template properties with minor improvements - in this case, we convert the Wiki link [[Pirate]] to the URI dbpedia:Pirate. Your next example is interesting - its Wiki source code contains the the following: {{Infobox Company | company_type = Anti-[[pirate|piracy]] [[private security]] }} We do have a mapping from to our ontology, so we generate the appropriate rdf:type property. The mapping also defines that the template property 'company_type' is mapped to the RDF property dbpedia-owl:type. The value for that property is extracted from the template property value, which contains a Wiki link to [[pirate]], which is (as before) converted to the URI dbpedia:Pirate. Maybe the parser shoudn't discard the prefix 'Anti'. :-) I hope that clears things up a bit Christopher On Tue, Sep 29, 2009 at 16:08, Piero Molino < > wrote: uPiero, Currently about one million resources have an rdf:type and dbpedia-owl:* properties. In other words, about one million (of 3.5 million) pages in the English Wikipedia use an Infobox template for which we have a mapping to the ontology. With the next DBpedia release (which should be ready in two weeks or so) the number will probably rise to about 1.1 million. Christopher On Wed, Sep 30, 2009 at 10:08, Piero Molino < > wrote:" "Announcing Virtuoso Open-Source Edition v 5.0.13" "uHi OpenLink Software is pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.13. 
This version includes: * Database engine - Added configuration option BuffersAllocation - Added configuration option AsyncQueueMaxThreads - Added docbook-xsl-1.75.2 - Added RoundRobin connection support - Removed deprecated samples/demos - Fixed copyright and license clarification - Fixed use MD5 from OpenSSL when possible - Fixed issue with XA exception, double rollback, transact timeout - Fixed issue reading last chunk in http session - Fixed use pipeline client in crawler - Fixed accept different headers in pipeline request - Fixed do not post when no post parameters - Fixed checkpoint messages in log - Fixed read after allocated memory - Fixed shortened long URLs in the crawlers view to avoid UI breakage - Fixed building with external zlib - Removed support for deprecated JDK 1.0, 1.1 and 1.2 - Rebuilt JDBC drivers * SPARQL and RDF - Added initial support for SPARQL-FED - Added initial support for SERVICE { }; - Added support for expressions in LIMIT and OFFSET clauses - Added built-in predicate IsRef() - Added new error reporting for unsupported syntax - Added rdf box id only serialization; stays compatible with 5/6 - Added support for SPARQL INSERT DATA / DELETE DATA - Added support for HAVING in sparql - Added special optimizations for handling: SPARQL SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } } - Added support for HTML+RDFa representation re. SPARQL CONSTRUCT and DESCRIBE query results - Added support for output:maxrows - Updated ontologies API - Updated iSPARQL application - Fixed IRI parts syntax to match SPARQL 1.0 W3C recommendation - Fixed support for XMLLiteral - Fixed bad box flags for strings for bnodes and types - Fixed replace lost filters with equivs that have no spog vars and no \"good\" subequivs. - Fixed cnet doublt awol:content - Fixed Googlebase query results with multiple entries - Fixed Googlebase location info - Fixed default sitemap crawling functions/pages - Fixed use SPARUL LOAD instead of SOFT - Fixed make sure version is intact as changes to .ttl file must reflect in sparql.sql - Fixed missing qualification of aggregate - Fixed compilation of ORDER BY column_idz clause in iterator of sponge with loop - Fixed UNION of SELECTs and for multiple OPTIONALs at one level with \"good\" and \"bad\" equalities - Fixed support for define output:format \"JSON\" - Fixed crash of rfc1808_expand_uri on base without schema - Fixed redundant trailing '>' in results of TTL load when IRIs contain special chars - Fixed \"option (score )\" in a gp with multiple OPTIONAL {} - Fixed when different TZ is used, must find offset and transform via GMT - Fixed SPARQL parsing and SQL codegen for negative numbers - Fixed some 'exotic' cases of NT outputs * ODS Applications - Added support for ckeditor - Added new popup calendar based on OAT - Added VSP and REST implementation for user API - Added new API functions - Added FOAF+SSL groups - Added feed admin rights - Added Facebook registration and login - Removed support for Kupu editor - Removed support for rte editor - Removed support for IE 5 and 6 compatibility - Fixed users paths to physical location - Fixed problem with activity pages Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. 
SIOC, FOAF, AtomOWL, SKOS): Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: OpenLink" "New DBpedia Overview Article Available" "uDear all, we are pleased to announce that a new overview article for DBpedia is available: The report covers several aspects of the DBpedia community project: * The DBpedia extraction framework. * The mappings wiki as the central structure for maintaining the community-curated DBpedia ontology. * Statistics on the multilingual support in DBpedia. * DBpedia live synchronisation with Wikipedia. * Statistics on the interlinking of DBpedia with other parts of the LOD cloud (incoming and outgoing links). * Several usage statistics: What kind of queries are asked against DBpedia and how did that change over the past years? How much traffic do the official static and live endpoint as well as the download server have? What are the most popular DBpedia datasets? * A description of use cases and applications of DBpedia in several areas (drop me mail if important applications are missing). * The relation of DBpedia to the YAGO, Freebase and WikiData projects. * Future challenges for the DBpedia project. After our ISWC 2009 paper on DBpedia, this is the (long overdue) new reference article for DBpedia, which should provide a good introduction to the project. We submitted the article as a system report to the Semantic Web journal, where it will be reviewed. Thanks a lot to all article contributors and to all DBpedia developers and users. Feel free to spread the information to interested groups and users. Kind regards, Jens" "extracting audios files and intermediate nodes" "uHi, I am extracting audio files for “Things” but I came across erroneous filenames for some entries. Fore example: SELECT ?audio WHERE { dbr:Korn dbp:filename ?audio } Returns: Korn - Predictable .ogg But the correct filename is: Korn - Predictable (demo).ogg After some investigation I found that SELECT ?audio WHERE { dbr:Korn1 dbo:filename ?audio } returns the right filename. Is this a bug or a feature? What is the recommended way to extract audio files? uHi Joakim, yes indeed contents inside () are stripped in some cases these triples come from different extractors and that is why there is a difference the triples with dbp:filename come from the raw infobox extractor that is known to be of low quality while the dbo:filename triples come from the mappings extractor that provides better quality data so in general you should use the latter The problem here is that the mappings extractor is used to map facts for the article while the {{listen}} template is most of the times only related to the article so, if there is no other mapped template in the page the page is mapped as Sound and when there are multiple templates we end up with many Korn1 Korn2 Korn3 Korn4 Since you obviously have a use case maybe you can tell us how it would be most convenient to provide such facts options a) continue as we do now b) for every listen template we create a new simple trlple like dbo:audioFile \"filename.ogg\" c) something more advanced that captures other metadata from the audio file Cheers, Dimitris On Thu, May 5, 2016 at 12:22 AM, Joakim Soderberg < > wrote: uThanks for the explanation Dimitri. My use case is to retrieve as many audio files as possible. The revised solution is: SELECT * WHERE { ?dbpediaId a dbo:Sound . 
{ ?dbpediaId dbo:fileURL ?fileURL } UNION { ?dbpediaId dcterm:format ?format FILTER (regex(str(?format), 'x-midi|ogg'))} UNION { ?dbpediaId dbo:fileExtension ?extension FILTER (regex(str(?extension), '(mid|wav)')) } UNION { ?dbpediaId dbo:filename ?filename FILTER (regex(str(?filename), '.*\\\\.(ogg|WebM|mp3|wav|oga|m4a|flac)$')) } } Which returns ~ 12 000 file names. uThanks for the feedback Joackim You might also want to take a look at you can get all the media files in wikimedia commons which might be also useful for you On Thu, May 5, 2016 at 7:28 PM, Joakim Soderberg < > wrote:" "Support for 304 not modified" "uHi group, We consume a lot of information in the LOD cloud, unfortunately most LOD services (including dbpedia) we tried do not seem to support caching and thus do not return a \"304 not modified\" on a resource. This gives unnecessary load on the endpoint and also makes it much harder for consumers devices like smart phones to cache the data to reduce both battery consumption and network load on bad 3G links. This could easily be fixed in Pubby or whatever software is used in taking care of the \"If-Modified-Since\" request header. Any chance this comes on a todo list? :) thanks cu Adrian" "URI lookup" "uHey I was wondering what is the best way to get URI, just like This is too naive: PREFIX rdfs: SELECT ?x ?y WHERE { ?x rdfs:label ?y . FILTER regex(?y, \"%Kindle%\") } Thanks Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org Hey I was wondering what is the best way to get URI, just like SELECT ?x ?y WHERE { ?x rdfs:label ?y . FILTER regex(?y, '%Kindle%') } Thanks Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org ue.g. Hope that helps. Georgi uPerfect! :) What is QueryClass suppose to mean? Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Fri, Nov 27, 2009 at 2:13 PM, Georgi Kobilarov < >wrote: uQueryClass was supposed to be a filter on rdfs:type, but I've never implemented it. Couldn't decide what to do about resources without rdfs:type. Cheers, Georgi From: Juan Sequeda [mailto: ] Sent: Friday, November 27, 2009 9:30 PM To: Georgi Kobilarov Cc: Subject: Re: [Dbpedia-discussion] URI lookup Perfect! :) What is QueryClass suppose to mean? Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Fri, Nov 27, 2009 at 2:13 PM, Georgi Kobilarov < > wrote: e.g. Hope that helps. Georgi umakes sense. Freebase has something like that too. Juan Sequeda, Ph.D Student Dept. of Computer Sciences The University of Texas at Austin www.juansequeda.com www.semanticwebaustin.org On Fri, Nov 27, 2009 at 5:49 PM, Georgi Kobilarov < >wrote: uIs there any way of getting these results as JSON? Thank you, Stephen Hatton Is there any way of getting these results as JSON? Hatton" "Instance types for languages where it's currently unavailable" "uHello, I'm working on a multilingual NLP tool that, among other things, uses dbpedia instance type data. However, I see that for certain smaller languages this data is not available. Can anyone elaborate on what it would take to extract and make this type of data available for a new language (e.g. my native Latvian), perhaps the required amount of work is somewhat feasible for me to do? 
Regards, Peteris uI would need editor rights to username there is peterisp. Can someone grant me these rights so I can try to input the mappings for Latvian language? Regards, Peteris On Mon, Feb 29, 2016 at 11:31 AM, Pēteris Paikens < > wrote:" "DBpedia to schema.org" "uHey all, the DBpedia ontology now provides mappings to schema.org vocabulary. There are 44 equivalent classes, 10 sub classes and 31 equivalent properties. This mappings can be edited via the mappings wiki. Because of the different detail level of schema.org and DBpedia ontology we only cover 54 of about 290 schema.org classes. For instance, schema.org has round about 120 classes for local businesses ( e.g. BikeStore, BookStore, ClothingStore, ComputerStore, Attorney, Dentist, Electrician, HousePainter, Locksmith, ). The DBpedia ontology doesn't provide such detailed classes, because Wikipedia hasn't articles for small local stores or services. I have attached lists of the mappings to this email. You are invited to discuss or edit the mappings. How to map schema.org classes/properties: The owl:equivalentClass property now contains a comma seperated list of equivalent classes. dbpedia-owl:Person for example: {{Class | rdfs: = person | owl:equivalentClass = foaf:Person, schema:Person }} Sub classes are defined by the rdfs:subClassOf property. Here, it is to consider that the order of given classes is important for the dbpedia ontology hierarchy. If a class is a sub class of another dbpedia class, the dbpedia class must be given first. dbpedia-owl:MusicFestival for example: {{Class | rdfs: = music festival | rdfs:subClassOf = Event, schema:Festival }} Equivalent properties are of course defined by owl:equivalentProperty. dbpedia-owl:foundedBy for example: {{ObjectProperty | rdfs: = founded by | rdfs:domain = owl:Thing | rdfs:range = owl:Thing | owl:equivalentProperty = schema:founders }} regards, Paul dbpedia-owl:Aircraft -> schema:Product dbpedia-owl:Automobile -> schema:Product dbpedia-owl:Band -> schema:MusicGroup dbpedia-owl:FilmFestival -> schema:Festival dbpedia-owl:Instrument -> schema:Product dbpedia-owl:Locomotive -> schema:Product dbpedia-owl:MusicalArtist -> schema:MusicGroup dbpedia-owl:MusicFestival -> schema:Festival dbpedia-owl:Ship -> schema:Product dbpedia-owl:Weapon -> schema:Product dbpedia-owl:Work = schema:CreativeWork dbpedia-owl:Website = schema:WebPage dbpedia-owl:University = schema:CollegeOrUniversity dbpedia-owl:TelevisionStation = schema:TelevisionStation dbpedia-owl:TelevisionEpisode = schema:TVEpisode dbpedia-owl:Stadium = schema:StadiumOrArena dbpedia-owl:SportsTeam = schema:SportsTeam dbpedia-owl:SportsEvent = schema:SportsEvent dbpedia-owl:Song = schema:MusicRecording dbpedia-owl:SkiArea = schema:SkiResort dbpedia-owl:Sculpture = schema:Sculpture dbpedia-owl:Restaurant = schema:Restaurant dbpedia-owl:River = schema:RiverBodyOfWater dbpedia-owl:School = schema:School dbpedia-owl:ShoppingMall = schema:ShoppingCenter dbpedia-owl:RadioStation = schema:RadioStation dbpedia-owl:Place = schema:Place dbpedia-owl:Library = schema:Library dbpedia-owl:Museum = schema:Museum dbpedia-owl:Painting = schema:Painting dbpedia-owl:Person = schema:Person dbpedia-owl:Park = schema:Park dbpedia-owl:Organisation = schema:Organization dbpedia-owl:Mountain = schema:Mountain dbpedia-owl:GovernmentAgency = schema:GovernmentOrganization dbpedia-owl:Language = schema:Language dbpedia-owl:Lake = schema:LakeBodyOfWater dbpedia-owl:Hotel = schema:Hotel dbpedia-owl:Hospital = schema:Hospital dbpedia-owl:HistoricPlace = 
schema:LandmarksOrHistoricalBuildings dbpedia-owl:HistoricBuilding = schema:LandmarksOrHistoricalBuildings dbpedia-owl:Film = schema:Movie dbpedia-owl:Event = schema:Event dbpedia-owl:EducationalInstitution = schema:EducationalOrganization dbpedia-owl:Country = schema:Country dbpedia-owl:Continent = schema:Continent dbpedia-owl:College = schema:CollegeOrUniversity dbpedia-owl:City = schema:City dbpedia-owl:Canal = schema:Canal dbpedia-owl:Book = schema:Book dbpedia-owl:BodyOfWater = schema:BodyOfWater dbpedia-owl:Arena = schema:StadiumOrArena dbpedia-owl:Album = schema:MusicAlbum dbpedia-owl:Airport = schema:Airport dbpedia-owl:startDate = schema:startDate dbpedia-owl:starring = schema:actors dbpedia-owl:spouse = schema:spouse dbpedia-owl:runtime = schema:duration dbpedia-owl:relative = schema:relatedTo dbpedia-owl:publisher = schema:publisher dbpedia-owl:producer = schema:producer dbpedia-owl:picture = schema:image dbpedia-owl:parentOrganisation = schema:branchOf dbpedia-owl:numberOfPages = schema:numberOfPages dbpedia-owl:numberOfEpisodes = schema:numberOfEpisodes dbpedia-owl:nationality = schema:nationality dbpedia-owl:musicComposer = schema:musicBy dbpedia-owl:map = schema:maps dbpedia-owl:mediaType = schema:bookFormat dbpedia-owl:location = schema:location dbpedia-owl:locatedInArea = schema:containedIn dbpedia-owl:language = schema:inLanguage dbpedia-owl:isbn = schema:isbn dbpedia-owl:illustrator = schema:illustrator dbpedia-owl:genre = schema:genre dbpedia-owl:foundedBy = schema:founders dbpedia-owl:episodeNumber = schema:episodeNumber dbpedia-owl:endDate = schema:endDate dbpedia-owl:director = schema:director dbpedia-owl:deathDate = schema:deathDate dbpedia-owl:child = schema:children dbpedia-owl:birthDate = schema:birthDate dbpedia-owl:award = schema:awards dbpedia-owl:author = schema:author dbpedia-owl:artist = schema:byArtist uPaul, Awesome! We're collecting mappings to Schema.org at [1] and I'd like to include a pointer to this mapping as well - what is the suggested canonical URI for it? Cheers, Michael [1] mappings uHi Michael, hi all, we don't provide an additional export only for the schema.org mappings. They are part of our ontology, so you can get it via our ontology export: I hope that is adequate for your requirements. regards, paul uPaul Thanks! Will be made available/referenced via a new 'Mapping' tab at [1]. Do you have any description along with it I can use? Cheers, Michael [1]" "Redirects on dbpedia-live include MediaWiki_talk URIs" "uHi, I'm attempting to find the latest dbpedia URI on a subject by following the chain of redirects. I'm using a sparql query directed to On some occasions the response indicates a redirect to MediaWiki_talk URIs. The query, for one of the problem nodes, look like this. select ?link, ?redirect WHERE { ?link . OPTIONAL {?link ?redirect} } Could this be a problem with the data, or am I trying to do the wrong thing? TIA Dave Spacey This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. 
uMy apologies, I should have included that it returns this as ?link Dave This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this." "Resources types" "uHi, DBpedia resources could be typed with concepts from external ontologies such as UMBEL. I want to get for a particular resource its types, for instance resource Apple have the following concepts : owl:Thing wikidata:Q19088 wikidata:Q756 dbo:Eukaryote dbo:Plant dbo:Species umbel-rc:BiologicalLivingObject umbel-rc:EukaryoticCell umbel-rc:FloweringPlant umbel-rc:Plant. 1. All types are included in the typing even they are subsumed by other concepts, how to get only first concepts (not all the hierarchy), or how to get them in hierarchical order ? 2. It looks like umbel concepts are not identical, if you take a look at umbel-rc:Plant , you find different hierarchical schema. - How does DBpedia attribute UMBEL concepts to its resources, and what is the UMBEL version used ? Regards. Hi, DBpedia resources could be typed with concepts from external ontologies such as UMBEL. I want to get for a particular resource its types, for instance resource Apple  have the following concepts : owl:Thing wikidata:Q19088 wikidata:Q756 dbo:Eukaryote dbo:Plant dbo:Species umbel-rc:BiologicalLivingObject umbel-rc:EukaryoticCell umbel-rc:FloweringPlant umbel-rc:Plant. 1. All types are included in the typing even they are subsumed by other concepts, how to get only first concepts (not all the hierarchy), or how to get them in hierarchical order ? 2. It looks like umbel concepts are not identical, if you take a look at umbel-rc :Plant , you find different hierarchical schema.  - How does DBpedia attribute UMBEL concepts to its resources, and what is the UMBEL version used ? Regards. uBy \"first type\", do you mean the most specific or the least specific? To get the most specific do this. If you have relationships like :a :subClassOf :b . And you know :x a :a . :x a :b . You can delete the second statement because, if you ran the first statement in standard RDFS inference with the rulebox above those statements, you'd get the second statement back. Anyway you can go through the graph and prune away any unnecessary facts until you reach a fixed point. Anyhow there was a literature on problems like this in description logic in the decade before OWL but I don't know about any pre-existing tools that incorporate this feature. Does anyone else know? On Mon, Aug 10, 2015 at 3:53 AM, Nasr Eddine < > wrote:" "DBpedia 3.8 released, including enlarged Ontology and additional localized Versions" "uHi all, we are happy to announce the release of DBpedia 3.8. The most important improvements of the new release compared to DBpedia 3.7 are: 1. the DBpedia 3.8 release is based on updated Wikipedia dumps dating from late May/early June 2012. 2. the DBpedia ontology is enlarged and the number of infobox to ontology mappings has risen. 3. the DBpedia internationalization has progressed and we now provide localized versions of DBpedia in even more languages. 
The English version of the DBpedia 3.8 knowledge base describes 3.77 million things, out of which 2.35 million are classified in a consistent Ontology, including 764,000 persons, 573,000 places (including 387,000 populated places), 333,000 creative works (including 112,000 music albums, 72,000 films and 18,000 video games), 192,000 organizations (including 45,000 companies and 42,000 educational institutions), 202,000 species and 5,500 diseases. We provide localized versions of DBpedia in 111 languages. All these versions together describe 20.8 million things, out of which 10.5 mio overlap (are interlinked) with concepts from the English DBpedia. The full DBpedia data set features labels and abstracts for 10.3 million unique things in 111 different languages; 8.0 million links to images and 24.4 million HTML links to external web pages; 27.2 million data links into external RDF data sets, 55.8 million links to Wikipedia categories, and 8.2 million YAGO categories. The dataset consists of 1.89 billion pieces of information (RDF triples) out of which 400 million were extracted from the English edition of Wikipedia, 1.46 billion were extracted from other language editions, and about 27 million are data links into external RDF data sets. The main changes between DBpedia 3.7 and 3.8 are described below: 1. Enlarged Ontology The DBpedia community added many new classes and properties on the mappings wiki. The DBpedia 3.8 ontology encompasses • 359 classes (DBpedia 3.7: 319) • 800 object properties (DBpedia 3.7: 750) • 859 datatype properties (DBpedia 3.7: 791) • 116 specialized datatype properties (DBpedia 3.7: 102) • 45 owl:equivalentClass and 31 owl:equivalentProperty mappings to 2. Additional Infobox to Ontology Mappings The editors of the mappings wiki also defined many new mappings from Wikipedia templates to DBpedia classes. For the DBpedia 3.8 extraction, we used 2347 mappings, among them • Polish: 382 mappings • English: 345 mappings • German: 211 mappings • Portuguese: 207 mappings • Greek: 180 mappings • Slovenian: 170 mappings • Korean: 146 mappings • Hungarian: 111 mappings • Spanish: 107 mappings • Turkish: 91 mappings • Czech: 66 mappings • Bulgarian: 61 mappings • Catalan: 52 mappings • Arabic: 51 mappings 3. New local DBpedia Chapters We are also happy to see the number of local DBpedia chapters in different countries rising. Since the 3.7 DBpedia release we welcomed the French, Italian and Japanese Chapters. In addition, we expect the Dutch DBpedia chapter to go online during the next months (in cooperation with and dereferencable URIs for the DBpedia data in their corresponding language. The DBpedia Internationalization page provides an overview of the current state of the DBpedia Internationalization effort. 4. New and updated RDF Links into External Data Sources We have added new RDF links pointing at resources in the following Linked Data sources: Amsterdam Museum, BBC Wildlife Finder, CORDIS, DBTune, Eurostat (Linked Statistics), GADM, LinkedGeoData, OpenEI (Open Energy Info). In addition, we have updated many of the existing RDF links pointing at other Linked Data sources. 5. New Wiktionary2RDF Extractor We developed a DBpedia extractor, that is configurable for any Wiktionary edition. It generates an comprehensive ontology about languages for use as a semantic lexical resource in linguistics. The data currently includes language, part of speech, senses with definitions, synonyms, taxonomies (hyponyms, hyperonyms, synonyms, antonyms) and translations for each lexical word. 
It furthermore is hosted as Linked Data and can serve as a central linking hub for LOD in linguistics. Currently available languages are English, German, French, Russian. In the next weeks we plan to add Vietnamese and Arabic. The goal is to allow the addition of languages just by configuration without the need of programming skills, enabling collaboration as in the Mappings Wiki. For more information visit 6. Improvements to the Data Extraction Framework • Additionally to N-Triples and N-Quads, the framework was extended to write triple files in Turtle format • Extraction steps that looked for links between different Wikipedia editions were replaced by more powerful post-processing scripts • Preparation time and effort for abstract extraction is minimized, extraction time is reduced to a few milliseconds per page • To save file system space, the framework can compress DBpedia triple files while writing and decompress Wikipedia XML dump files while reading • Using some bit twiddling, we can now load all ~200 million inter-language links into a few GB of RAM and analyze them • Users can download ontology and mappings from mappings wiki and store them in files to avoid downloading them for each extraction, which takes a lot of time and makes extraction results less reproducible • We now use IRIs for all languages except English, which uses URIs for backwards compatibility • We now resolve redirects in all datasets where the objects URIs are DBpedia resources • We check that extracted dates are valid (e.g. February never has 30 days) and its format is valid according to its XML Schema type, e.g. xsd:gYearMonth • We improved the removal of HTML character references from the abstracts • When extracting raw infobox properties, we make sure that predicate URI can be used in RDF/XML by appending an underscore if necessary • Page IDs and Revision IDs datasets now use the DBpedia resource as subject URI, not the Wikipedia page URL • We use foaf:isPrimaryTopicOf instead of foaf:page for the link from DBpedia resource to Wikipedia page • New inter-language link datasets for all languages Accessing the DBpedia 3.8 Release You can download the new DBpedia dataset from As usual, the dataset is also available as Linked Data and via the DBpedia SPARQL endpoint at Credits Lots of thanks to • Jona Christopher Sahnwaldt (Freie Universität Berlin, Germany) for improving the DBpedia extraction framework and for extracting the DBpedia 3.8 data sets. • Dimitris Kontokostas (Aristotle University of Thessaloniki, Greece) for implementing the language generalizations to the extraction framework. • Uli Zellbeck and Anja Jentzsch (Freie Universität Berlin, Germany) for generating the new and updated RDF links to external datasets using the Silk interlinking framework. • Jonas Brekle (Universität Leipzig, Germany) and Sebastian Hellmann (Universität Leipzig, Germany) for their work on the new Wikionary2RDF extractor. • All editors that contributed to the DBpedia ontology mappings via the Mappings Wiki. • The whole Internationalization Committee for pushing the DBpedia internationalization forward. • Kingsley Idehen and Patrick van Kleef (both OpenLink Software) for loading the dataset into the Virtuoso instance that serves the Linked Data view and SPARQL endpoint. OpenLink Software ( for providing the server infrastructure for DBpedia. 
The work on the DBpedia 3.8 release was financially supported by the European Commission through the projects LOD2 - Creating Knowledge out of Linked Data ( LATC - LOD Around the Clock ( RDF links). More information about DBpedia is found at http://dbpedia.org/About Have fun with the new DBpedia release! Cheers, Chris Bizer" "Filter the results of Lookup" "uDear all, I'm using Lookup WS to query information in DBpedia. The results of Lookup sometimes are too much for me. For example, when I enter \"Administration\", Lookup returns also \"Carter_administration\" while I only want to get the information in the wiki page of \"Administration\", such as \"Administration_(law)\", \"Administration_(business)\", Do you know how to filter the results of Lookup for that or are there any features to distinguish between dbpedia:Administration_(law) and dbpedia:Carter_administration? Thank you so much, Thanh-Tu Dear all, I'm using Lookup WS to query information in DBpedia. The results of Lookup sometimes are too much for me. For example, when I enter 'Administration', Lookup returns also 'Carter_administration' while I only want to get the information in the wiki page of 'Administration', such as 'Administration_(law)', 'Administration_(business)', Do you know how to filter the results of Lookup for that or are there any features to distinguish between dbpedia:Administration_(law) and dbpedia:Carter_administration? Thank you so much, Thanh-Tu uNguyen Thanh Tu wrote: On an orthogonal note, I encourage you to also look at: Your issue comes down to the thorny matter of disambiguation by pivoting across Entity Type and Entity Property dimensions. If you want to make your own Lookup Service, basically, one that has a better UI than our basic UI, you can use the Web Service associated /FCT. Links: 1. VirtuosoURIBurnerSampleTutorial" "type of Bachelor_of_Arts" "uHi Valentina, (and CCing the DBpedia discussion list) this is an effect of the heuristic typing we employ in DBpedia [1]. It works correctly in many cases, and sometimes it fails - as for these examples (the classic tradeoff between coverage and precision). To briefly explain how the error comes into existence: we look at the distribution of types that occur for the ingoing properties of an untyped instance. For dbpedia:Bachelor_of_Arts, there are, among others, 208 ingoing properties with the predicate dbpedia-owl:almaMater (which is already questionable). For that predicate, 87.6% of the objects are of type dbpedia-owl:University. So we have a strong pattern, with many supporting statements, and we conclude that dbpedia:Bachelor_of_Arts is a university. That mechanism, as I said, works reasonable well, but sometimes fails at single instances, like this one. For dbpedia:Academic_degree, you'll find similar questionable statements involving that instace, that mislead the heuristic typing algorithm. With the 2014 release, we further tried to reduce errors like these by filtering common nouns using WordNet before assigning types to instances, but both \"Academic degree\" and \"Bachelor of Arts\" escaped our nets here :-( The public DBpedia endpoint loads both the infobox based types and the heuristic types. If you need a \"clean\" version, I advise you to set up a local endpoint and load only the infobox based types into it. Best, Heiko [1] Am 13.10.2014 02:42, schrieb Valentina Presutti: uHi Valentina, I am not sure whether I understand you correctly. 
There might be cases of metonymy in DBpedia, but as far as I can see, Wikipedia is usually quite good at separating them via disambiguation pages, I am not sure whether there are too many example. The problem with the degrees, as far as I can tell, is not a metonymy one (degrees are just degrees, I have never seen them used to refer to a university), but simply a series of shortcomings in DBpedia. What happens here inside DBpedia is the following: * First, we find an infobox which says that someone's almaMater is, say, \"Princeton University (B.A.)\". Both Princeton and B.A. are linked to the respective Wikipedia pages. * The extraction framework extracts two statements from that: PersonX almaMater Princeton_University, and PersonX almaMater Bachelor_of_Arts (the second one being an error, which is very hard to avoid in the general case) * Since that happens a few times, we infer that Bachelor_of_Arts is a University. So in that case, I think it's purely a DBpedia problem. If you are aware of any actual cases of metonymy, however, I am curious to hear about that. All the best, Heiko Am 13.10.2014 16:33, schrieb Valentina Presutti: uHi guys, the problem of DBpedia entity types is not a new one. I have discussed some of the problems in the paper [1]. In my opinion it is pretty hard to establish the correct type of an entity using only one source of information. DBpedia uses Infoboxes (both infobox types and recently it performs type inference based on individual properties), YAGO uses categories while Tipalo uses first sentences. But each method has its inherent difficulties. In our recent research we try to combine these methods in order to provide most accurate typing of DBpedia entites. The intermediate results are available at [2]. Still they are not perfect, but we are making progress. E.g. we do not have \"Bachelor of Arts\", but we have: \"Bachelor of Arts in Applied Psychology\" and similar, which are classified as EducationalDegree. We use Cyc and Umbel concepts as types of the entities. If anyone is interested in that research, I am eager to discuss it during the forthcoming ISWC in Italy. Cheers, Aleksander [1] A. Pohl, Classifying the Wikipedia Articles into the OpenCyc Taxonomy [in:] Proceedings of the Web of Linked Entities Workshop in conjuction with the 11th International Semantic Web Conference, Giuseppe Rizzo, Pablo Mendes, Eric Charton, Sebastian Hellmann, Aditya Kalyanpurs (eds.), p. 5-16, ISSN: 1613-0073. [2] uI think the real problem here is that it is you can't make one database to satisfy everyone's requirements. For instance if you want to build a system to do reasoning about academic and professional credentials, this is easiest to build on top of an ontology (data structures) that is designed for the task and with data that is curated for the task. Like other databases, there is some serious overlap with DBpedia (enough that you could populate or enrich a credentials database from DBpedia), but you're always going to fight with the \"notability requirement\" in Wikipedia. Sooner or later there will be concept that is essential to your scheme that is not there. That's no reason not to hook up with DBpedia, but it is to recognize that is not going to please everybody and a big part of the value is of an exchange language uHi Valentina, We're also working on topics that involve relation extraction and had similar problems to yours. 
Together with colleagues from the Prague University of Economics we presented a solution approach at the last DBpedia Meeting in September, which should be rolling out in the next months. If you set up your own endpoint you can use the Linked Hypernyms Dataset [1], which is also available in English [2] . If you use the LHD 1 dataset you will get results like below. This is due to the fact that the necessary ontology classes don't exist yet in the main mappings wiki, but will probably find their way there soon. < [1] [2] Cheers, Alexandru On Tue, Oct 14, 2014 at 6:03 PM, Paul Houle < > wrote:" "JSON format" "uHi, Maybe I missed it in literature, but I'd like to know what does \"similarityScore\" and \"suppor\" mean in output like this? { \"@surfaceForm\" : \"IBM\", \"@URI\" : \" \"@offset\" : \"275\", \"@similarityScore\" : \"0.11658907681703568\", \"@types\" : \"DBpedia:Company,DBpedia:Organisation,Schema:Organization,Freebase:/business /customer,Freebase:/business,Freebase:/business/brand,Freebase:/computer/pro gramming_language_developer,Freebase:/computer,Freebase:/venture_capital/ven ture_investor,Freebase:/venture_capital,Freebase:/business/employer,Freebase :/education/educational_institution,Freebase:/education,Freebase:/book/book_ subject,Freebase:/book,Freebase:/organization/organization_founder,Freebase: /organization,Freebase:/architecture/architectural_structure_owner,Freebase: /architecture,Freebase:/conferences/conference_sponsor,Freebase:/conferences ,Freebase:/business/sponsor,Freebase:/cvg/cvg_publisher,Freebase:/cvg,Freeba se:/computer/processor_manufacturer,Freebase:/business/issuer,Freebase:/orga nization/organization,Freebase:/business/business_operation,Freebase:/intern et/website_owner,Freebase:/internet,Freebase:/award/ranked_item,Freebase:/aw ard,Freebase:/computer/operating_system_developer,Freebase:/computer/compute r_manufacturer_brand,Freebase:/award/competitor,Freebase:/computer/programmi ng_language_designer,Freebase:/computer/software_developer,Freebase:/award/a ward_presenting_organization,DBpedia:TopicalConcept\", \"@support\" : \"4875\", \"@percentageOfSecondRank\" : \"-1.0\" } Thanks, Srecko uHi Srecko, Can you share from where you retrieved this output? Best, Dimitris On Wed, Feb 27, 2013 at 4:53 PM, Srecko Joksimovic < > wrote: uHi Dimitris, Of course… I just called this method: public String DBPediaSpotLightGetRequest(String conf, String support, String type, String text) throws URISyntaxException{ String cont=\"\"; HttpClient client = new DefaultHttpClient(); URIBuilder builder = new URIBuilder(); builder.setScheme(\"http\").setHost(\"spotlight.dbpedia.org/rest\").setPath(\"/annotate\") .setParameter(\"text\", text) .setParameter(\"confidence\", conf) .setParameter(\"support\", support) .setParameter(\"types\", type); URI uri = builder.build(); System.out.println(uri); HttpGet httpget = new HttpGet(uri); HttpGet request = new HttpGet(httpget.getURI()); request.addHeader(\"accept\", ACCEPTMETHOD); HttpResponse response; try { response = client.execute(request); BufferedReader rd = new BufferedReader (new InputStreamReader(response.getEntity().getContent())); String line = \"\"; while ((line = rd.readLine()) != null) { cont=cont.concat(line); } } catch (ClientProtocolException e) { e.printStackTrace(); } catch (IOException e) { e.printStackTrace(); } return cont; } What are the rest of parameters? I used text, support, confidence and types… I suppose there is something else as well? 
Thanks, Srecko From: Dimitris Kontokostas [mailto: ] Sent: Wednesday, February 27, 2013 21:01 To: Srecko Joksimovic Cc: Subject: Re: [Dbpedia-discussion] JSON format Hi Srecko, Can you share from where you retrieved this output? Best, Dimitris On Wed, Feb 27, 2013 at 4:53 PM, Srecko Joksimovic < > wrote: Hi, Maybe I missed it in literature, but I’d like to know what does “similarityScore” and “suppor” mean in output like this? { \"@surfaceForm\" : \"IBM\", \"@URI\" : \" \"@offset\" : \"275\", \"@similarityScore\" : \"0.11658907681703568\", \"@types\" : \"DBpedia:Company,DBpedia:Organisation,Schema:Organization,Freebase:/business/customer,Freebase:/business,Freebase:/business/brand,Freebase:/computer/programming_language_developer,Freebase:/computer,Freebase:/venture_capital/venture_investor,Freebase:/venture_capital,Freebase:/business/employer,Freebase:/education/educational_institution,Freebase:/education,Freebase:/book/book_subject,Freebase:/book,Freebase:/organization/organization_founder,Freebase:/organization,Freebase:/architecture/architectural_structure_owner,Freebase:/architecture,Freebase:/conferences/conference_sponsor,Freebase:/conferences,Freebase:/business/sponsor,Freebase:/cvg/cvg_publisher,Freebase:/cvg,Freebase:/computer/processor_manufacturer,Freebase:/business/issuer,Freebase:/organization/organization,Freebase:/business/business_operation,Freebase:/internet/website_owner,Freebase:/internet,Freebase:/award/ranked_item,Freebase:/award,Freebase:/computer/operating_system_developer,Freebase:/computer/computer_manufacturer_brand,Freebase:/award/competitor,Freebase:/computer/programming_language_designer,Freebase:/computer/software_developer,Freebase:/award/award_presenting_organization,DBpedia:TopicalConcept\", \"@support\" : \"4875\", \"@percentageOfSecondRank\" : \"-1.0\" } Thanks, Srecko" "problem creating a mapping - again." "uHello. A few days ago I sent an email reporting a problem, because I couldn't add mappings with infobox pages that use the character '/' on the link. I tried to solve my problem creating a new infobox page at wikipedia wich is a redirect to the previous one. Now I can create my mapping but I don't have any result when I click on \"test this mapping\" ( I read somewhere in the wiki that I can't use redirect pages, but I have one mapping that is working ( it's a redirect page, so I'm not sure if I can use this or not. I'm asking for some advise for this problem, because in my language all the infobox I want to use have the '/' on the link. Thanks in advance. Best, Vânia R. Hello. A few days ago I sent an email reporting a problem, because I couldn't add mappings with infobox pages that use the character '/' on the link. I tried to solve my problem creating a new infobox page at wikipedia wich is a redirect to the previous one. Now I can create my mapping but I don't have any result when I click on 'test this mapping' ( R. uHi Vânia, On Wed, Jan 5, 2011 at 23:22, Vânia Rodrigues < > wrote: Indeed, you have created a mapping with the name of a redirect (Info_artista_musical) that points to an inbobox template. When extracting data, the framework is looking for templates on Wikipedia pages having this name. If people use the redirect name in Wikipedia instead of the canonical name of the template (Info/Música/artista), it is possible to extract data using your mapping. This is why it works for these examples: But there is no Wikipedia page that uses the redirect name Info_Genero_musical. 
Every page about music genres in the Portuguese Wikipedia seems to use the canonical name Info/Gênero_musical. Therefore, the test mapping for Info_Genero_musical does not produces any output. We should definitely fix this. Creating mappings for the canonical name of infoboxes is the right thing to do, because then all infoboxes can be extracted: that ones that use the canonical name and the ones that use redirect names (they are resolved to the canonical name). I can propose another work-around until this is fixed. You can download the extraction framework [1] and start it in server mode. In the input box that will appear in your browser, you can enter the URI of the article you would like to extract. For example, I tested it for Rock_and_roll, Samba and House_music of the Portuguese Wikipedia, which all use the template Info/Gênero_musical. I know this is not as convenient as the test mappings, but maybe it is a possibility until we fixed the bug. Cheers, Max [1]" "ORDER BY and utf8 (accents in portuguese names)" "uHi! I'm new to the list and to the whole Linked Data thing altoghether, but will have to work on it for some time so I'll be exploring the DBpedia SPARQL end-point for a project we are working on. The thing is, it is based on Portuguese resources, and I was wondering first of all what is the general feeling on working with DBpedia on other-than-english resources. For example, (and this is my concrete question), is there a way to correctly order results so accented characters are also correctly ordered? If I try this example query on SELECT ?label WHERE { [] skos:subject ; rdfs:label ?label . FILTER ( langMatches(lang(?label), \"PT\") ) } ORDER BY ?label I get this results Notice that at the end Urbano Tavares Rodrigues Vasco Graça Moura Vitorino Nemésio Álvaro Cunhal \"Álvaro\" gets incorrectly ordered at the end. Is this dependent on the underlying store (much like SQL collations)? Is there something that can be done on my side to get this query ordered like it should be for Portuguese / UTF8 (some notation in SPARQL or specific to DBpedia / Virtuoso )? Thanks and keep up the good work!! This is a truly impresive project! Regards, Alex uHi, I would think that it is store and implementation dependent. Does a specified UTF-8 character order exist? Did you try it on other stores or Jena? (I just realize that I'm not answering your question, but adding more questions on top ;) There is a paper at ISWC, that might be interesting also: Sören Auer, Matthias Weidl, Jens Lehmann, Amrapali J. Zaveri, Key-Sun Choi: I18n of Semantic Web Applications. Regards, Sebastian Am 26.08.2010 16:08, schrieb Alex Rodriguez Lopez: uHi Sebastian, thanks for the link to the article on i18n of semantic web apps, any resource about i18n is greatly appreciated since it is generally regarded as a \"less-important-than-other-things\" issue. About UTF-8 ordering, I was refering to collations RDBMS like MySQL tend to use with UTF-8: But after all it seems there is a defined \"stardard\" order or collation for UTF-8: I didn't try the query on other stores or Jena, so I can't tell if there is any difference. I'll post findings when I do. Regards, Alex" ""Attention data" in DBpedia?" "uHi, I was wondering if the following data is available anywhere as part of DBpedia, or otherwise if there's any hope of getting it from DBpedia in the future. I think, but I'm not sure, that the raw data should be availabe in the Wikipedia database dumps. 1. View counts for Wikipedia pages. 2. 
Total number of edits for each Wikipedia page. 3. Inlink counts for Wikipedia pages. The first two are attention data. That's an interesting aspect of Wikipedia that isn't fully exploited yet. There are interesting applications where I could learn stuff about my own dataset by meshing it up with attention data from DBpedia. The third one is, in some way, also a measure of attention, and can be useful for ranking. (I'm thinking about stuff that can be done with the New York Times SKOS dataset, and using attention data from Wikipedia to gain insight into the NYT data might be quite interesting.) So, any hint about how to get the data above would be appreciated. Best, Richard uHi Richard, I'm pretty sure that the first two are not available in the Wikipedia dumps. For example, [1] lists pages-meta-current.xml.bz2 as \"All pages, current versions only.\" I don't think there is a dump of all pages. There once may have been one, but it probably became too big. For the view count, see [2]. But hey, I also found the following at [3]: \"Domas Mituzas put together a system to gather access statistics from wikipedia's squid cluster and publishes it here\" [4]. The inlink count can of course be extracted, either from the Wikipedia dump [5] or from DBpedia [6]. I wrote a bit of Java code that does exactly that because I needed it for the faceted browser [7], but didn't publish the results. I can send you the code if you want, though it's not really \"productized\". Cheers, Christopher [1] [2] [3] [4] [5] [6] [7] On Tue, Nov 10, 2009 at 23:26, Richard Cyganiak < > wrote: uSilly me. I meant \"of all revisions\", but anyway that's not what you need. I think the edits are in this dump: The largest of all those huge files at 11 GB. On Tue, Nov 10, 2009 at 23:54, Jona Christopher Sahnwaldt < > wrote: uHi Richard, On 10 Nov 2009, at 17:26, Richard Cyganiak wrote: The SIOC MediaWiki exporter [1] can be used for Wikipedia and exposes recursive links to all the edits of each wiki page, and links to other pages (both internal and external) for each version, cf [2]. Consequently, you can easily get 3 by retrieving the page with that exporter, and counting the links. However, currently, getting the number of edits implies recursive fetching and may be relatively long. Yet, if there is a way to directly get the number of edits via the MediaWiki API, we could add that feature to the export, same for requirement 1 (cc-ing Fabrizio that worked on it as I'm not aware if these features are in the API or not). In addition, I think that to efficiently link that information to DBPedia, it would require that DBPedia provides information about the version of Wikipedia pages that have been used for the export (e.g. version number / ID). That way, we could accurately link a DBPedia resource to the SIOC description of the corresponding Wikipedia page that has been used to extract this DBPedia information. Would it be something that can be done by the DBPedia team ? (e.g dbpedia:createdFrom -> wikipedia page + seeAlso to the SIOC-exported version to enabled interlinking) Best, Alex. [1] [2] uOn Wed, Nov 11, 2009 at 00:07, Alexandre Passant < > wrote: The live extraction already does that, e.g. contains 3354 and Christopher uOn 10 Nov 2009, at 18:28, Jona Christopher Sahnwaldt wrote: wow, excellent, thanks ! Related SIOC page can then be immediately extracted: Alex. 
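For the inlink-count option mentioned above, a rough sketch that streams the page links dump and tallies how often each resource appears as a link target; the file name and the plain N-Triples layout are assumptions based on the standard DBpedia downloads:

import bz2
from collections import Counter

# Counts how often each DBpedia resource occurs as the object of a
# page-link triple, i.e. its inlink count. Assumes a pagelinks dump in
# N-Triples form (<s> <p> <o> .) named page_links_en.nt.bz2.
inlinks = Counter()
with bz2.open("page_links_en.nt.bz2", "rt", encoding="utf-8", errors="replace") as f:
    for line in f:
        parts = line.split()
        if len(parts) >= 4 and parts[2].startswith("<http://dbpedia.org/resource/"):
            inlinks[parts[2].strip("<>")] += 1

for uri, count in inlinks.most_common(20):
    print(count, uri)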
uHi Fabrizio, On 14 Nov 2009, at 18:18, Fabrizio Orlandi wrote: Unfortunately, as Christopher already said, the counter value is not being updated in the database, so it's in a state from some years ago, and therefore pretty useless. (Barack Obama has 0, for example.) Thanks Fabrizio. Best, Richard" "Editor rights for dbpedia mappings" "uDear dbpedia authors,I'm a full time professor at the University of applied sciences in Hof, Germany. As part of my research on unified information access, I'd like to add an ontology class \"SnookerPlayer\" as a subclass of Athlete and also add an Infobox mapping for the English Wikipedia. It's only a small number of individuals in the according categories, but I'd like to get an understanding of how it works. From my perspective, the best thing is to contribute in order to understand it from inside out.Since there is currently the snooker world championship taking place, I thought I'd start with the category advanceRené" "limitation of offset in sparql request ?" "uThe below request has permanent error 500 PREFIX rdfs: SELECT ?l WHERE { [] rdfs:label ?l. } LIMIT 1000 OFFSET 11575400 I starting dumping label from offset 0 to 11575400 step of 1000. at this point I have not all label, why it blocks at 11575400 ? Best regards Luc DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" The below request has permanent error 500 PREFIX rdfs: < ?l. } LIMIT 1000 OFFSET 11575400 I starting dumping label from offset 0 to 11575400 step of 1000. at this point I have not all label, why it blocks at 11575400 ? Best regards Luc uHi Luc, There was a problem on the DBpedia Server which has been resolved and the query now runs Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 22 Aug 2011, at 14:46, luc peuvrier at home wrote:" "Missing Ancient Greek with type Language" "uI was exploring around today using the dbpedia-owl types for an app I'll be working on soon. It will ask teachers to specify the languages they teach by picking relevant wikipedia pages. So far everything looks great, but when I did queries for Greek it looks like Ancient Greek was missing: PREFIX dbpedia-owl: PREFIX dbprop: PREFIX rdfs: SELECT DISTINCT ?lang ?fam WHERE { ?lang a dbpedia-owl:Language ; rdfs:label ?name . OPTIONAL { ?lang dbprop:fam ?fam . } FILTER (REGEX (?name, 'greek', 'i') ) } Gets me Modern_Greek, Cappadocian_Greek_Language , and Greek_Language, but not Ancient_Greek. The dbpedia page template box, but not that it got the Language type. Does this look like a bug, or something that was missed in the manual work that went into creating the ontology? Thanks much! Patrick Murray-John http://semantic.umwblogs.org uHi Peter, A bit of both :) The Wikipedia article Ancient_Greek used the template:Language instead of Template:Infobox_Language. Template:Language is a redirect to Infobox_Language, so the article content displays correctly. We had no automated approach to include template-redirects, and we only included the most common template-redirects manually. And missed the template:Language. I've just simply changed the Wikipedia article Ancient_Greek to use template:Infobox_Language now, so we will be fine with the next extraction (will it be the next dataset release once there is a new Wikipedia dump, or via live-update if we manage to implement that soon :) Best, Georgi PS: We should follow up on the conversation we had in London in March about including your work with ChemBoxes into DBpedia uGeorgi, Ahthat makes sense! 
I did a query to check for other articles with the same issue, and it turned up quite a list" "DBpedia live updates" "uHi all, DBpedia Live updates temporarily stopped. The reason is related to and breaks how the live servers handle authentication to get the updates stream. We are working on a fix. Best, Dimitris" "questions about the extraction process" "uHi everybody, I just came across DBPedia and it looks like a really awesome project. I noticed that the latest dump from the English Wikipedia is quite old (January 2008, Freebase does a new release every 3 months), so I am interested in creating a more up-to-date dbpedia from a recent download of the Wikipedia pages and articles (which I have). Is this possible? I am most interested in knowing which Wikipedia pages are people, companies, and possible disambiguations. Perhaps if there are URLs of the person (or company logo) on Wikipedia, that would be good too. I downloaded the SVN from sourceforge, but I'm afraid I am a little lost as I was unable to find any documentation. I have a suspicion that the entry point is the extraction/extract.php filebut I have no idea where to put the bz2 wikipedia dump file. Additionally, I noticed that the PersonData preview link is broken. This is disappointing as it is the one I am most interested in (so I had to download the full dataset). Is there any reason why this is only created from the German data? Shug uHello, Shug Boabby schrieb: Yes. You could use the YAGO hierarchy for this (which will be much improved in the next DBpedia release). Using subclass inferencing in Virtuoso, you can ask the DBpedia SPARQL endpoint for all individuals, which are instances of the class you are looking for, i.e. ask for instances of persons. (Note that there are resource limits on the server, so I believe it will only return 1000 results at a time. You need to ask several queries to get all persons.) You need to import all the dumps in a MySQL database. There is an import PHP script provided to do this. (Also see other threads in this list.) The extract.php file is the entry point for the complete extraction process. You may alter it to extract only what you are looking for. Some developer information about the framework can be found on documentation on creating your own DBpedia release. It is a very time-consuming process. I fixed it. There is no strict reason not to include it. I believe the extractor had problems on the English Wikipedia, which need to be fixed. It might be included in the next release. Kind regards, Jens uHello, At the time we started the project (December 2006), the German Wikipedia hold much more structured information in the Persondata template than the English Wikipedia. This might have changed in the meantime, so it's time to extract that data from the English Wikipedia as well. Cheers, Georgi" "bug #1744807: Complex infobox values" "uHello, The bug ( concerns unparsed values in the Wikipedia infobox dataset (by my count, it's in about 1.5% of the triples in infobox.nt). Has anybody discussed ideas for parsing this data? I understand the current effort is focused on pushing out the updated dataset, but I just thought I'd share my thoughts on this subject. In some cases, it is just marked-up text, but in others, it's data that would be useful to parse (e.g., dates like \"[[March 7]], [[322 BC]]\", \"[[Milton, Massachusetts|Milton]], [[Massachusetts]]\" and date ranges). A simple yet reliable rule seems like it would be a challenge, but I have a few observations: 1. 
Dates and date ranges should be pretty easy to parse, although how would you store date ranges that have multiple values? Does it make sense to split such property into two triples (start and end)? 2. Look for property names that include the word\"date\" to try different mechanisms to parse dates from it. 3. In many cases, just parsing the first chunk of text is at least approximately correct (as a \"primary\" result). For example, getting \"Barbara Liskov\" out of: \"[[Barbara Liskov]] and her students at [[MIT]]\" . 4. Location-oriented properties are typically hierarchical (e.g., \"[[Corsica]], [[France]]\") I understand that some magic may still be involved in detecting and correctly parsing these cases, but I throw it out there anyway. In any case, I have some time to help get the new dataset out, let me know how I can assist. Rich uHi Rich, Piet has worked on unparsed values in the Wikipedia infobox dataset and I think he did a great job on improving the extraction code for them. He will be back in Berlin tomorrow and will check wheter he has allready implemented your ideas or if they are still missing from the code. Cheers Chris uRich Knopman wrote: I honestly don't think it makes much sense to spend time to find a solution for that in DBpedia. I rather think the problem should be tackled in Wikipedia itself. I think a fundamental paradigm of DBpedia should be not to change or alter any meaning of information extracted from Wikipedia. Exactly this would happen, if we omit stuff between list items in literals containing lists or enumerations. Also, how do you envision \"[[Barbara Liskov]] and her students at [[MIT]]\" to be represented? I think parsing here does not make sense, some people will want to have the literal rendered as HTML, some as pure text. If you want to ask, where \"Barbara Liskov\" is involved, you can also make a fultext-search on the literal data. On the Wikipedia side however, it will be easy to find cleaner and more meaningful representations (e.g. using sub-templates) and I think DBpedia should not aim for a quick hack but rather for longterm solutions. Anyway, Rich thanks a lot for sharing your ideas and I hope you find ways to contribute further to DBpedia. uSören, I agree with your thoughts. Fixing things at the source improves consistency and simplifies extraction for Dbpedia. Has there been talk in the Dbpedia community about interacting with the Wikipedia community to make the data more consistent? Do we have any examples of this to date? Dbpedia certainly provides the basis for tools that allow people to police infobox consistency, if we decide to promote that. I also agree with not changing the data in Wikipedia. The [[Barbara Liskov]] example below is really a list and the data within should be retained (perhaps as two values). The intent of the point below really concerns annotation cases. For example with the triple (:Renaissance_Center, dbpedia2:floor_count, \"73 story tower with four 39 story towers and two 21 story towers\"), I'd like to be able to retrieve the value itself (e.g., 73) and have the rest of the text available as an annotation or comment. Although again, fixing this in the Dbpedia extraction process would be problematic. Rich it's in about 1.5% of the triples in infobox.nt). Has anybody discussed focused thoughts on this subject. solution for that in DBpedia. I rather think the problem should be tackled in Wikipedia itself. alter any meaning of information extracted from Wikipedia. 
Exactly this would happen, if we omit stuff between list items in literals containing lists or enumerations. [[MIT]]\" to be represented? I think parsing here does not make sense, some people will want to have the literal rendered as HTML, some as pure text. If you want to ask, where \"Barbara Liskov\" is involved, you can also make a fultext-search on the literal data. meaningful representations (e.g. using sub-templates) and I think DBpedia should not aim for a quick hack but rather for longterm solutions. ways to contribute further to DBpedia." "can't create template mapping with bengali namespace" "uHello, I want to create template mapping with bengali namspace (bn). I have created an account in have followed the instructions under the heading \"*Create new mappings*\" on the following page: I wan to map Infobox:film for bengali wikipedia so i tried to enter the following url into the browser : there is no mapping exists for the box. I want to \"create\" and start writing, but there is no create link. So what should I do now? I have also tried the Mapping Tool for Bengali language directly to do the above said job. this time mappings works. I saved the mapping code clicking on the \"send to DBpedia\" button under the \"output\" successfully. But later i didn't found the saved mapping code anywhere. Can anybody please guide me through the whole process and tell me what i have missed? Please. Thank you. ArupHello, I want to create template mapping with bengali namspace (bn). I have created an account in mappings works. I saved the mapping code clicking on the 'send to DBpedia' button under the 'output' successfully. But later i didn't found the saved mapping code anywhere. Can anybody please guide me through the whole process and tell me what i have missed? Please. Thank you. Arup uWhat is your username? Best, Pablo On Mon, Oct 31, 2011 at 12:07 PM, Arup Sarkar < > wrote: uHello Mr. Mendes, Thanks for the reply. My username is \"Arup\" and user ID is 233. Arup On Mon, Oct 31, 2011 at 7:43 PM, Pablo Mendes < > wrote: uYou have now been made editor. Happy mappings! Cheers Pablo On Oct 31, 2011 7:57 PM, \"Arup Sarkar\" < > wrote: uI meant: happy mapping! :) Cheers Pablo On Oct 31, 2011 8:43 PM, \"Pablo Mendes\" < > wrote: uThank you very much Arup On Tue, Nov 1, 2011 at 1:14 AM, Pablo Mendes < > wrote:" "How to get extended summary" "uPlease help . I have been doing this . SELECT str(?abstract) WHERE { < FILTER (langMatches(lang(?abstract),\"en\"))} uDear Somesh, you have a bug in your query. Try SELECT str(?abstract) WHERE { ?abstract FILTER (langMatches(lang(?abstract),\"en\"))} Best, Heiko Am 05.07.2012 16:37, schrieb Somesh Jain: uTry SELECT ?abstract WHERE { ?abstract FILTER (langMatches(lang(?abstract),\"en\"))} Am 05.07.2012 16:37, schrieb Somesh Jain: uHi Somesh, On 07/05/2012 04:37 PM, Somesh Jain wrote: if I understand what you want to do correctly, then you should just replace with" "DBpedia ontology property mappings" "uI'd like to have a list of the mappings from Wikipedia infobox strings and template attributes to dbpedia properties. Is this available or the bits and pieces from which I can generate it? For example, I'd like to know that in some cases (e.g., for Barack_Obama) an infobox uses the string 'Born' and markup attribute 'birth_date' for a property that is represented in RDF as 'dbpedia-owl:birthdate' and also as 'dbpprop:dateOfBirth'. If I also knew the associated infobox template, I guess that might be helpful also. 
I am a little confused by the different dbpedia properties that are associated with some infobox attributes. for example Barack_Obama also has an RDF property 'dbpprop:birthDate' with value 'dbpedia:Barack_Obama/birthDate/birth_date_and_age'. What does this represent? I'm sorry that this is a bit vague. Ultimately, I am trying to enumerate possible ways of expressing different dbpedia-owl properties in text and I thought this might be a useful source of data to look at. Tim uHi Tim, the bits and pieces are available from the mappings themselves are available as Excel file mapping.xls and there is some documentation at The process we were using was to export the mapping from the xls into a relational database which is then used by the infobox extraction code to translate infobox property names to terms in the DBpedia ontology. The process as well as the mappings themselves are still far away from being optimal and we plan to design a proper RDF language to represent the mappings and implement a UI so that external contributors can help to update and extend the mappings. Due to a lack of resources this goes on slow but we hope to make some progress in the next months. Up-to-then, all we can offer are the bits and pieces listed above. Cheers Chris" "Another question about conversions from wikipedia to dbpedia" "uHi, Please take a look at this wikipedia page: Why is it converted into triples in dbpedia of the sort: The Harvard university people don't have the property name of each of its members, but this is what we ended up with in dbpedia. More specifically the semantics of the name property seem to me to be confused here (I am not talking about its intuitive meaning to us but the meaning that comes from consistent correct use). For example the name property is used to map a URI of a person to the string of their name such as: \"The Hon. Sir Edward Kenny\" I am interested in getting some feedback on thisMy last post on infoboxes was unanswered so if you are kind enough to look at that too, I would be very thankful! Marv" "scala extraction framework for Russian wiki" "uHello, I'm doing an \"answer with facts\" project - if a user asks something which can be answered with a simple fact (\"what's the population of USA?\"), I answer him/her using dbpedia. For the facts to be more precise, the fact database needs to be updated more often than dbpedia itself. For that, I download recently modified Wikipedia pages and parse them into dbpedia-like pages. The php extraction framework seems to work for this, however - as far as I've understood - the scala framework is faster and generally better. Sadly, when I try to run it, an exception occurs: \"No mappings available for language ru\". Is there really no way to use the scala framework for the Russian wikipedia, or am I missing something? Thank you in advance and sorry if that's a stupid question :) Sincerely yours, Sonya Alexandrova uHi Sonya, there are two different extractors for Infoboxes: the MappingExtractor and the InfoboxExtractor. The MappingExtractor uses mappings defined on but if you are motivated to create some, let us know, you are more than welcome to. ;) The InfoboxExtractor uses our initial, now three year old infobox parsing approach. This extractor extracts all properties from all infoboxes and templates within Wikipedia articles, independently of the language and should also work for Russian. Extracted information is represented using properties in the namespace. 
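To see the two kinds of properties side by side for a single resource, one can filter on the two namespaces; a sketch, assuming the public endpoint and that it supports the SPARQL 1.1 string functions used here:

from SPARQLWrapper import SPARQLWrapper, JSON

# Lists the raw infobox properties (http://dbpedia.org/property/) and the
# mapping-based ontology properties (http://dbpedia.org/ontology/) used on
# one resource, so the two extraction outputs can be compared.
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT DISTINCT ?p WHERE {
        <http://dbpedia.org/resource/Barack_Obama> ?p ?o .
        FILTER (STRSTARTS(STR(?p), "http://dbpedia.org/property/")
             || STRSTARTS(STR(?p), "http://dbpedia.org/ontology/"))
    }
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["p"]["value"])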
However, the quality of these properties is, in general, worse than the mapping based properties. See also: Best, max On Thu, Jul 1, 2010 at 10:42 AM, Sonya Alexandrova < > wrote:" "Indian-summer school on Linked Data (ISSLOD 2011)" "
ISSLOD Application Deadline: 30 July 2011 Notifications: 5 August 2011 ISSLOD: 12-18 September 2011 There will be a limited number of student grants available. Details of the registration process will be announced on the Web site, after the application deadline. We will keep the registration fee low (175 EUR) and provide reasonable accommodation packages (less than 40 EUR per night) for students. On behalf of the LOD2 project and AKSW research group Sören" "Companies" "uHello, I was wondering if somebody could provide some guidance on how I might resolve simple company names like \"Apple\" or \"International Business Machines\" to dbpedia resources a la \"/resources/Apple\". Once I perform that translation, which dataset contains all information about Apple? I was browing through the Downloads section of the website, but none of the datasets appear to contain information about companies (headquarters, # of employees, stock ticker, etc). I know this information is available because I can pull up: which is quite thorough. I could, of course, pull it off the server directly, but I'm looking to have all of that information local. In summary: what's the best way to translate a company name to a dbpedia resource and what dataset actually contains the information shown in that URL for company resources? uRobert, I'd guess, Wikicompany [1] might be of help. Cheers, Michael [1] uOn Thu, Nov 11, 2010 at 6:32 AM, Robert Campbell < > wrote: Did you run across quite good, and it has a nice xml webservice, e.g. Once you get the URL for the resource you want, you can resolve it and dig into the RDF to see what's there. There is also the SPARQL endpoint too [1] for when you get familiar with the RDF data that's in dbpedia. //Ed uHi Robert, On Thu, Nov 11, 2010 at 12:32 PM, Robert Campbell < > wrote: We are in the process of developing a disambiguation framework for DBpedia that will be released still this year. For your case it can be used to find the company that is meant in a given string and a *context*. The context is important since there can be more than one meaning for \"Apple\" (for example use of the word \"Apple\" should be clear (for example \"Apple produces the iPad.\"). So you could give our new tool a text that contains a company name that you would like to link to a DBpedia resource and it will return the corresponding URI. Of course, this can be done with other types of entities as well, not only with companies. The datasets are not sorted by type. To get the information of all companies you would have to load the Ontology Instance Types and the Ontology Infobox Properties. Or you could ask the SPARQL end point ( Best, Max uThanks Ed. Is there any way to do all of this offline? I assume since dbpedia provides datasets for download, I should be able to have an offline RDF database containing everything I need. I'm guessing the lookup service is online only, but I could try to find alternatives for that piece. On Thu, Nov 11, 2010 at 12:55 PM, Ed Summers < > wrote: uOn 11/11/10 6:32 AM, Robert Campbell wrote: uYou could use the Freebase data dumps to narrow down what you're looking for and then go to DBpedia for any missing information. They're down weekly and include both DBpedia IDs as well as the original Wikipedia article number, so you can easily link to either. Anything with the type /business/business_operation should be a company, division, subsidiary, etc. You can use the data for anything you want as long as you provide attribution. 
Tom On Thu, Nov 11, 2010 at 6:58 AM, Robert Campbell < > wrote: uOn 11/11/10 11:55 AM, Tom Morris wrote: Tom, Where are the RDF format dumps from Freebase? If they aren't delivering RDF dumps, of what value are these dumps to someone working with Linked Data? Note, I don't think people are expecting to make translations when the initial goal was to reconcile companies leveraging common data representation. Hopefully, you will prove me wrong and unveil their RDF dumps. Note, I've requested these dumps repeatedly from freebase (pre acquisition) and basically stopped asking. Kingsley uOn Thu, Nov 11, 2010 at 12:32 PM, Kingsley Idehen < > wrote: As far as I know they provide an RDF end point, but not RDF dumps. The wiki page that I linked to has detailed information on the dump formats (quads similar to N3 triples and a lightly processed Wikipedia dump format called WEX). I've actually heard some people preach that linked data isn't solely about RDF. In this particular case the request was for _local data_ about _companies_. There was no linked or Linked or RDF aspect to it. Don't get me wrong. I think Freebase RDF dumps would be nice. It just doesn't have anything whatsoever to do with what the user asked. As long as we're talking about standards though, both DBpedia and Freebase use largely private schemas/vocabularies, so even if you were to get things in RDF you would still have a ton of non-standard stuff to deal with. Tom uOn 11/11/10 1:09 PM, Tom Morris wrote: Yes, I am aware of the SPARQL endpoints. Similar != Equivalent :-) Again, this I know, I am very familiar with Freebase (tech and personnel). Yes, I am one such individual. I asked: do they have an RDF dump. If they offer an endpoint (that emits RDF output) what's the problem with a dump? Local data with DBpedia meshing in mind. You can dump all of part of DBpedia to a local drive or DBMS (e.g. quad or triple store). I think it does since I don't think the user (who posted to DBpedia forum) has data translation in mind. Of course note. As demonstrated by the Freebase endpoint. It has a boat load of cross references to DBpedia that I can obtain in RDF formats :-) Kingsley uOn 11/11/2010 12:32 PM, Kingsley Idehen wrote: I believe the content of rdf.freebase.com can be generated by a very simple algorithm from the quad dump. I think it would be less than a week of work to code something up that downloads the quad dump, transforms it into triples, and outputs a big NT file. I'm not going to do it unless somebody pushes a few G's my way because I don't need it uOn 11/11/10 2:31 PM, Paul Houle wrote: It's much less that than using Virtuoso's sponger or the Virtuoso crawler + Sponger etc That's a simple export to N number files post ingestion. uTry with the OKKAM Entity Name System (ENS) you can search by company name and you can retrieve the DBPedia URI and additional URIs for the resource. You can also narrow down the search on organizations. API access is also available. Best, Angela. On Thu, Nov 11, 2010 at 10:00 PM, Kingsley Idehen < > wrote:" "How can i exit DBpedia mailing list ?" "uHi ! How can i exit DBpedia mailing list ? uYou can use this link On Wed, Oct 30, 2013 at 8:45 PM, xovcia00 < >wrote: uOn Oct 30, 2013, at 02:45 PM, xovcia00 wrote: The headers of every sourceforge list message include several links that may be helpful to you" "About the license over external links" "uHi everyone, I am looking for the license of the different data available on dbpedia.org. 
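A rough sketch of the kind of one-pass transformation described above, turning quad-dump lines into one big N-Triples stream; the tab-separated column layout and the rdf.freebase.com namespace used here are assumptions for illustration and would need to be checked against the actual dump documentation:

import sys

# Reads tab-separated quads (subject, predicate, object, value) from stdin
# and writes N-Triples to stdout. Both the column layout and the
# http://rdf.freebase.com/ns prefix are assumptions, not the documented
# dump format.
NS = "http://rdf.freebase.com/ns"

def to_uri(key):
    return "<%s%s>" % (NS, key)

for line in sys.stdin:
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 4:
        continue
    s, p, o, v = cols
    if o:      # object column holds another topic id -> resource object
        obj = to_uri(o)
    else:      # otherwise the value column holds a literal
        obj = '"%s"' % v.replace("\\", "\\\\").replace('"', '\\"')
    print(to_uri(s), to_uri(p), obj, ".")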
As I saw all the data extracted from Wikipedia, as well as the data from What about the links to external resources, like Yago, or Factbook ? Thanks, Julien" "FYI: What is an ORCID ID?" "uI recently posted a question about ORCID identifiers in DBpedia. It occurs to me that some of you may not have encountered the term before; and that many of you will be eligible to use one. An \"Open Research Contributor Identifier\" (ORCID; ), is a UID (which can be expressed as a URI) for researchers and academic and other authors. Think of it as a DoI for people. Mine is shown below. An ORCID disambiguates people with the same or similar names; and identifies works by the same author under different names (changes on marriage, divorce; different spellings or initialisations, etc.) as being by that one person. As the website says: \"ORCID is an open, non-profit, community-driven effort to create and maintain a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers\". Several journals and publishers, not least Nature, are including ORCID in their publishing workflows, and institutions are including it in their staff records system. Individuals can sign up for an ORCID at and then include it in their attribution in their research papers, other publications, correspondence and stationery. I encourage you to do so." "Attribution to thumbnail creators" "uMany of the thumbnail-Links in DBpedia point to photos under a CC-by-sa license, e.g. dbpedia-owl:thumbnail ; The according Wikipedia page links to , which includes license, author and attribution information ( However, I could not find this information in DBpedia, and therefore I'm not able not give due credits to the thumbnail author when working with the DBpedia data. Did I miss something? Otherwise, is this a known issue? Are there any workarrounds? Are you considerung to include author/attribution information for thumbnails? Cheers, Joachim DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" Many of the thumbnail-Links in DBpedia point to photos under a CC-by-sa license, e.g. < dbpedia-owl:thumbnail < ; The according Wikipedia page links to which includes license, author and attribution information ( . However, I could not find this information in DBpedia, and therefore I'm not able not give due credits to the thumbnail author when working with the DBpedia data. Did I miss something? Otherwise, is this a known issue? Are there any workarrounds? Are you considerung to include author/attribution information for thumbnails? Cheers, Joachim uHi Joachim, the image dataset files contain triples that link the images with the Wikipedia upload page. This should provide a workaround / solution for your problem. For example, the first few lines of images_en.nt.bz2 [1] contain these triples: . . . . . Similar for . The public endpoint for example: select ?p, ?o where { ?p ?o } returns: Maybe these links to the upload pages should also be included in Cheers, JC [1] On Tue, Jun 19, 2012 at 2:17 PM, Neubert Joachim < > wrote: uHi Jona Christopher, Thank you very much for your hints - great to hear that the data is available. The triple . should be a good fit for a foto credit, with \"English Wikipedia\" as link text (I suppose, the authors would be forgiving that their names are not mentioned directly). 
As we currently look up the resource page, I'd strongly support your consideration - it puts less stress on Thanks again - Joachim" "DBpedia Data Quality Evaluation Campaign - Results and Winner" "uDear all, We would like to thank all the participants who took part in the DBpedia Data Quality Evaluation Campaign. A total of 58 participants evaluated 521 distinct resources. This campaign has helped in identifying the major data quality problems in DBpedia which will help us improve the quality in the long run. We would also like to announce the lucky winner: Maxim Kolchin !!! Congratulations !!! Here is a link to a manuscript which describes the methodology and presents the results: Thank you. Regards, DBpedia Data Quality Evaluation Team. dbpedia-data-quality" "Editor rights for mapping" "uI there, how can i get the editor rights to edit a mapping in Portuguese language? My username is “Cesararaujo”. Thanks, César Araújo I there, how can i get the editor rights to edit a mapping in Portuguese language? My username is “Cesararaujo”. Thanks, César Araújo uHi César, You now have editor rights: happy mappings! Roland On 24-03-14 23:00, César Araújo wrote:" "Parentheses in PrefixedName" "uHello, I'm trying to run some SPARQL queries. Consider these two queries that should yield the same output: First query: PREFIX dbpedia: SELECT * WHERE { ?b } Second query: PREFIX dbpedia: SELECT * WHERE { dbpedia:Time_(magazine) ?b } The first query will run just fine, the second query will fail with \"37000 Error SP030: SPARQL compiler, line 1: syntax error at '(' before 'magazine'\". Any clue as to why this happens? Do I need to encode the parentheses? Queries without parentheses work fine in both cases. Regards, Michael uHello Michael, Actually, both should not work. The URL should be encoded, so either or :Time_%28magazine%29 I was actually surprised to also find in DBpedia, so they are different URIs, which is obviously wrong. I think the yago dataset has bad url encoding and should be fixed. Regards, Sebastian, AKSW Michael Haas schrieb:" "using CombineDateMapping to combine 3 fields" "uHi, What is the correct way to combine 3 infobox fields (day, month, year) into a datetime property such as birthDate? I tried to use the CombineDateMapping template but [looking at the sample extractor output] it does not work: {{CombineDateMapping | templateProperty1 = dz_diena | unit1 = xsd:gDay | templateProperty2 = dz_mēnesis | unit2 = xsd:gMonth | templateProperty3 = dz_gads | unit3 = xsd:gYear | ontologyProperty = birthDate }} Testing: - look at the 1st object shown in \"Test this mapping\" (Aizeks Azimovs). It has these 3 fields defined and should extract a birthDate property but it does not. P.S. All other uses of CombineDateMapping that I found only combined 2 properties (day-month + year). Thanks, Uldis Hi, What is the correct way to combine 3 infobox fields (day, month, year) into a datetime property such as birthDate? I tried to use the CombineDateMapping template but [looking at the sample extractor output] it does not work: Uldis uCould you help me with a problem with CombineDateMapping template for LV DBPedia (see below)? I asked this question during the mapping sprint but did not get a reply." 
"Integrating RML in the DBpedia extraction framework" "uStudent: Wouter Maroy Mentors: Anastasia Dimou, Dimitris Kontokostas TL;DR; The goal of this GSoC project was to start the integration of RML ( (that all were completed successfully): To read this in a nicely formatted way click here: Introduction DBpedia uses it’s own defined mappings for extracting triples from Wikipedia. The goal of this project was to integrate RML, a general mapping language for triples and replace the original mappings with RML mapping documents. In terms of goals, this project had two main goals and one optional goal. Main goals: - Translate the DBpedia defined mappings to RML mapping documents - Importing RML documents into the extraction framework and converting them to the existing DBpedia mapping data structures Optional goal: - Create a prototype of an integrated RML processor in the DBpedia extraction framework The project was a success. All goals of the project (including the optional goal) were completed and generated successful results. First goal: translating the DBpedia mappings to RML mappings DBpedia uses different types of custom mappings (e.g. simple property mappings, date interval mappings) for extracting triples from Wikipedia infoboxes. These are in general quite complex. Creating one-on-one mappings from DBpedia mappings to RML mappings was no easy task. Designing these mappings required quite some time during the project. We wanted this to be very accurate because the better these translations are, the better the results will be in the end of the process. To create the alignment it was necessary to dive into the exact details of how the DBpedia mappings were used in the extraction framework. In the other way around, it was necessary to fully understand how an RML mapping could produce the same results. All the DBpedia mappings eventually got their RML mapping version. Some mappings were straightforward but most of the cases were very specific and needed a custom solution. The next step was to automate the translation from the original DBpedia mapping files that are stored on GitHub to their corresponding RML version. This has also been done and was implemented in the extraction framework in the server module. Through this functionality it is now possible to access the RML version of every DBpedia mapping that is present on the running server. Second goal: importing and converting RML A first step towards integrating the executing of RML mapping documents is adding a parser that understands RML documents and converts these into a structure the extraction framework understands. To be specific, the extraction framework uses mapping data structures to store it’s loaded mappings. This parser loads the RML mapping documents and converts these to the mapping data structures. The advantage of using this method is that RML documents can be run and generate triples just as if it were using the old mapping documents. There are no big changes needed in the extraction framework itself to make this work. The drawback is that not all functionality of RML is available. Only the specific mappings designed for each DBpedia mapping can be understood and executed by this parser. For all functionality to be available, an RML processor needs to be integrated fully. An implementation of this parser was added to the extraction framework. It can read all the custom design mappings that were created. It is possible for the framework to load and run these mappings. 
The produced results are very good, the generated triples are the same as if the process would be run with loading the original DBpedia mappings. Optional goal: prototyping an integrated RML processor To make all functionality from RML available a real RML processor is needed. With an integrated RML processor it would be possible to test the mapping documents that were designed during the first part of the project. In the scope of this project an optional goal was to create a prototype to give an idea what is possible. There were some discussions on how this could be implemented and a solution was picked. A prototype was implemented and produced positive results. The generated triples were not all complete, but it served the purpose of a proof-of-concept implementation. The implementation proved that this workflow for integrating the processor is a possible solution if fully implemented. There was no certainty if it would be possible to create this prototype during the scope of this project. It depended on how long it would take to finalize the main goals. Luckily everything went as planned and the optional goal was completed successfully. Links Commits: (unmerged) GSoC Project #6213126861094912" "Lookup service and wiki categories" "uHello there, I tried the lookup service with this query: was expecting to get this category within the results there. What am I doing wrong? Also, from one of the categories I do get, pages under this category by using \"is dcterms:subject of\" However, that information is not included for this category Thanks, Leyla Hello there, I tried the lookup service with this query: Leyla" "Performance issues when setting up local DBpedia live" "uHi all, We tried setting up DBpedia Live on one of our local servers and keep it in sync with the official DBpedia live, using the dbpintegrator tool. Our setup is documented here: However, adding / deleting triples from / to Virtuoso has become extremely slow. I was pointed in the direction of by @kidehen and @pkleef. I changed the following parameters in Virtuoso's .ini file from their default value: [Database] MaxCheckpointRemap = 1000000 [Parameters] NumberOfBuffers = 1360000 MaxDirtyBuffers = 1000000 I can see no real difference in performance though, Virtuoso gets stuck on a DELETE query (it is busy with it for the last 125 minutes at 100% CPU :-S). As I am not at all an expert on Virtuoso tuning, all suggestions are welcome. For completeness, I have included the output of status('rhck'); in isql below. Best regards, Karel uForgot to mention the machine details: 24GB RAM 2x quadcore Xeon E5540 2.5GHz Virtuoso data is on SSD disks Regards, Karel On Fri, Sep 16, 2011 at 1:57 PM, karel braeckman < >wrote: uAnd the Virtuoso config is a single instance: OpenLink Virtuoso version 06.01.3127, on Linux (x86_64-unknown-linux-gnu), Single Edition There are currently about 300M triples in the store. I read that 500M per 16GB or RAM should still run ok, so I don't think a clustered version should be necessary? Performing Sparql queries still is very fast, but adding / deleting triples seems to be very slow. On Fri, Sep 16, 2011 at 2:26 PM, karel braeckman < > wrote: uHi Karel, First of all try these settings which we use for our DBpedia-Live virtuoso instance: NumberOfBuffers = 800000 MaxDirtyBuffers = 600000 and please don't forget to restart you Virtuoso server after changing those settings, in order for them to take effect. 
Secondly, in the website you sent, you have mentioned that it took two days to fill your Virtuoso instance with an initial dump of DBpedia-Live, which seems too long for me. Thirdly, my question is \"How long is Virtuoso running till now?\" uOn 9/16/11 9:41 AM, karel braeckman wrote: Thing is that via the cluster edition you get parallelism re. Insert, Update, and Delete (IUD) operations, in addition to Read oriented queries. Since we have a new version of Virtuoso at: mode to yours i.e., connecting to live.dbpedia.org we should be able to send you a patched single-server edition that would at least ensure parity re. IUD operations relative to our instance. Patrick will make this patch available to you so that we can eliminate issues that might be version related etc Kingsley uHi Karel, Your parameters look ok, but you may want to try adding the following: [Parameters] DefaultIsolation = 2 which sets a different transaction isolation level which is more suitable for situation where updates/deletes and queries are done on the same server. Did you also set your linux kernel swappiness parameter as per the following Tips and Tricks article: If not than your Linux kernel may start swapping out parts of your virtuoso process pages in favor of filesystem cache which will seriously hurt virtuoso's performance. In general you should make sure your system never starts swapping. Can you tell me the exact version of VOS you are using on your system and whether you are using the OS supplied version or if you compiled and installed it yourself. Note that the current version of Virtuoso OpenSource is 6.1.3 from: However if you are running an older version and are not afraid to do a build yourself, i would like to give you access to a prerelease of the upcoming 6.1.4 which has a number of new optimisations and fixes that maybe of benefit. Lastly on the subject of this dbpintegrator part, can you tell me the content of the file: lastDownloadDate.dat Patrick uHi guys, First of all, thanks for all the suggestions. I changed my settings according to your suggestions, and at first the problem was the same (100% CPU for quite a while when deleting triples) but after a while (~10 minutes) Virtuoso completed the action and now the tool seems to be running ok. I'm afraid I don't know which of the settings did finally got it to work. I have one more problem with the dbpintegrator tool however. I set the date in my lastDownloadDate.dat to 2011-09-10-00-000000, but the tool seems to start at the first file of the current hour (2011-09-16-16-000001), could this be a bug? @Mohamed: It really did take two days to fill the store with the DBpedia live dump. Initially it was fast, but it got slower and slower. There already was the default DBpedia dump (not the live version) inserted into another graph, maybe the amount of triples is just too large? How fast should it (more or less) take to load the live dump into Virtuoso you think? Virtuoso was running for a few weeks the first time I tried to run the sync tool. Since then, I restarted it a few times after changing config files and trying to debug things. @Patrick, @Kingsley: The version I am using is Version: 06.01.3127, Build: Mar 16 2011 of VOS. I downloaded and compiled it (on Ubuntu 10.04.2 LTS). The lastDownloadDate.dat file contains 2011-09-16-16-000555 at the moment of writing (the tool is working now). 
Best regards and thanks for the hints, Karel On Fri, Sep 16, 2011 at 3:53 PM, Patrick van Kleef < > wrote: uHi Karel, On 09/16/2011 04:51 PM, karel braeckman wrote: > Hi guys, > > First of all, thanks for all the suggestions. I changed my settings > according to your suggestions, and at first the problem was the same > (100% CPU for quite a while when deleting triples) but after a while > (~10 minutes) Virtuoso completed the action and now the tool seems to > be running ok. I'm afraid I don't know which of the settings did > finally got it to work. Nice to hear that :). > > I have one more problem with the dbpintegrator tool however. I set the > date in my lastDownloadDate.dat to 2011-09-10-00-000000, but the tool > seems to start at the first file of the current hour > (2011-09-16-16-000001), could this be a bug? I've tried the dbpintegrator tool on my machine starting from the point you mention and it seems to work properly, so please recheck your settings and let me know if the problem still exists. > > @Mohamed: > It really did take two days to fill the store with the DBpedia live > dump. Initially it was fast, but it got slower and slower. There > already was the default DBpedia dump (not the live version) inserted > into another graph, maybe the amount of triples is just too large? How > fast should it (more or less) take to load the live dump into Virtuoso > you think? Not 100% sure but it should take something like 3-4 hours. > Virtuoso was running for a few weeks the first time I tried to run the > sync tool. Since then, I restarted it a few times after changing > config files and trying to debug things. Exactly, this what I meant, a restart could be helpful. > > @Patrick, @Kingsley: > The version I am using is Version: 06.01.3127, Build: Mar 16 2011 of > VOS. I downloaded and compiled it (on Ubuntu 10.04.2 LTS). > > The lastDownloadDate.dat file contains 2011-09-16-16-000555 at the > moment of writing (the tool is working now). > > Best regards and thanks for the hints, > Karel > > On Fri, Sep 16, 2011 at 3:53 PM, Patrick van Kleef > < > wrote: >> Hi Karel, >> >>> Forgot to mention the machine details: >>> >>> 24GB RAM >>> 2x quadcore Xeon E5540 2.5GHz >>> Virtuoso data is on SSD disks >> >> Your parameters look ok, but you may want to try adding the following: >> >> [Parameters] >> >> DefaultIsolation = 2 >> >> >> which sets a different transaction isolation level which is more suitable >> for situation where updates/deletes and queries are done on the same server. >> >> >> Did you also set your linux kernel swappiness parameter as per the following >> Tips and Tricks article: >> >> >> >> If not than your Linux kernel may start swapping out parts of your virtuoso >> process pages in favor of filesystem cache which will seriously hurt >> virtuoso's performance. >> >> In general you should make sure your system never starts swapping. >> >> >> Can you tell me the exact version of VOS you are using on your system and >> whether you are using the OS supplied version or if you compiled and >> installed it yourself. Note that the current version of Virtuoso OpenSource >> is 6.1.3 from: >> >> >> >> However if you are running an older version and are not afraid to do a build >> yourself, i would like to give you access to a prerelease of the upcoming >> 6.1.4 which has a number of new optimisations and fixes that maybe of >> benefit. 
>> >> >> Lastly on the subject of this dbpintegrator part, can you tell me the >> content of the file: >> >> lastDownloadDate.dat >> >> >> Patrick >> >> >> > > uOn 9/16/11 10:51 AM, karel braeckman wrote: On lod.openlinksw.com (an 8-node cluster with 48GB per cluster node) we load DBpedia in under 30 minutes. Basically, you run the multi threaded installer across each cluster node in parallel to pull this off. If using the Single Server then it will take longer since you won't have the parallelism to exploit. uHi Karel, On 09/16/2011 04:51 PM, karel braeckman wrote: Nice to hear that :). I've tried the dbpintegrator tool on my machine starting from the point you mention and it seems to work properly, so please recheck your settings and let me know if the problem still exists Not 100% sure but it should take something like 3-4 hours. Exactly, this what I meant, a restart could be helpful. uHi Mohamed, You were right, I checked my settings and the tool does start at the correct date. I must have done something wrong earlier. Best regards, Karel On Fri, Sep 16, 2011 at 5:26 PM, Mohamed Morsey < > wrote: uHi Guys, I have let the dbpintegrator run over the weekend, but it got stuck again (same behavior as mentioned before: Virtuoso stuck at 100% CPU use). The good news is we think we found the problem. We found something strange in the update file with triples to delete where things got stuck. The file uses variables, but some of the variables are used more than once. For instance, the file 000981.removed.nt uses the ?o0 and ?o1 variables twice: ?o0 . ?o0 . ?o1 . ?o1 . This is probably the reason why the DELETE query for this file has a higher complexity and tripped Virtuoso up. We changed the code of dbpintegrator slightly to perform a Sparql query per line of this file instead of one query for the entire file. The tool is running without any problems so far. So perhaps something is going wrong in creating the files with the deletion triples? Best regards, Karel On Fri, Sep 16, 2011 at 5:53 PM, karel braeckman < > wrote: uHi Karel, On 09/19/2011 11:33 AM, karel braeckman wrote: I got the problem. This is a good solution, and I'll check the problem in the extraction framework itself and use incremental variable numbers, so the number do not coincide." "Triple counts for DBpedia chapters & Dbpedia as a whole" "uHello all, are there any more accurate triples counts available for the language versions of DBpedia than here: Is there an accurate triple count for the whole DBpedia? regards, Martin" "DBpedia-flickr Photo Wrapper" "uHi all, Christian Becker, a master student here at Freie Universität, has implemented a very nice DBpedia-flickr photo wrapper. The wrapper generates a set photos for each concept in DBpedia. It smartly combines labels in different languages and geo-coordinates in order to get precise results. Some examples: More information about the wrapper is available at The wrapper supports Linked Data and works nicely together with for example the OpenLink RDF browser and its photo features. After some additional small changes to the wrapper next week, we will interlink the generated photo-collections with DBpedia and add the links to the DBpedia dataset. What do you think about the wrapper? Any ideas for improvements? Cheers Chris uHi Chris, Hi Christian, Very nice, I like it :) My main comments would be that: 1. 
Right now the HTML rendering of the content doesn't do much for me that I can't get at How about showing some of the data from DBpedia (location/gmap, facts, etc) in the HTML version to demonstrate the power and ease of linked data-based mashups? 2. It would be nice to be able to address the HTML and RDF outputs separately, ie. and . With that you could hide the RDF/XML output from novice users and just provide a link. HTH, and looking forward to seeing the next version, Tom. On 08/09/2007, Chris Bizer < > wrote: uHi Tom, Thanks! I agree. I'll have to talk with Chris on how we'll proceed with this, as it grew out of a small demo for something else. I'll add this to the documentation shortly - we have a parameter called \"format\", so you can actually call rdf or xhtml Cheers, Christian uChris Bizer wrote: I think I should start a dbpedia-skeptics mailing list, since that is what I am, a skeptic. You can find photos of Brandenburg Gate and the Statue of Liberty. So what? What does that prove? I entered \"Örebro\" after the slash in the URL and I got some photos from this town in Sweden, but they are not at all typical of the town. And the heading on the page is broken UTF-8. I then entered \"Öjebro\", which is a very small place, and this tool found nothing. People who know this place will know that the thing to photograph is the old stone bridge, and Google image search immediately finds that, So could you please provide some examples that show which problem this tool tries to solve? \"It uses RDF\" does not count, since I don't know anybody who has that kind of problem. uHi Lars Being usually a skeptic myself, I think I should bite. :-) Data do not *prove* anything. It's just sitting there waiting for people to use it if they find it useful. All the point of Linking Open Data is, IMHO: We have data all over the Web in an unbound variety of more or less structured format, with more or less explicit semantics. We have a format (RDF) in which those formats can be migrated using all sorts of heuristics. The benefit of this common format is the ability to link data together in order to further navigate and query them as a single data base (using e.g., SPARQL), based on explicit and open semanticsand all that in an open format. This is work in progress, right? What does that prove? ;-) That to-date Google algorithms and data base are the winner against Linking Open Data? Sure enough. But do you know *how* Google found that? Certainly not, their algorithms are their best kept assets. On Linking Open Data, the way you get answers is perfectly clear and open. Put up a semantic query, and find the triples matching them. If there are not, there are not. You know why. You don't retrieve anything because data are not there. If you feeel like it, publish Linked Data about Öjebro, and put them in the LOD network. BTW there is already a URI for this place waiting for you to use it : So publish something like foaf:page Add the bridge to geonames data base if you feel like it, using Geonames wiki (there is a \"bridge\" category) , improve matching english Wikipedia article, publish photographs on flickr, whatever > So could you please provide some examples that show which problem The particular tool (flickr wrapper) does not \"solve\" any problem by itself. It augments the content you can access using semantic queries. The more you get of that stuff, the more interlinked it will be, and the more it will prove useful. 
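A minimal sketch, in N-Triples, of the kind of statements Bernard has in mind; every identifier below is a placeholder rather than the real GeoNames ID, page or photo URL for Öjebro:

# hypothetical identifiers throughout
<http://sws.geonames.org/0000000/> <http://xmlns.com/foaf/0.1/page> <http://example.org/ojebro.html> .
<http://sws.geonames.org/0000000/> <http://xmlns.com/foaf/0.1/depiction> <http://example.org/photos/ojebro-stone-bridge.jpg> .

Once statements like these are published and interlinked, a mashup in the style of the flickr wrapper discussed in this thread could in principle pick the photos up by following links rather than by guessing from labels and coordinates.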
Best Bernard uHi Lars, On 11 Sep 2007, at 02:48, Lars Aronsson wrote: Sure, please do create that list. I won't subscribe though ;-) I think the point of this project is not to allow people to find photos of things. It's up to Christian to tell us about his motivations for creating this, but on the technical side, this is a data mashup: It combines data from two different openly available datasets, namely photos from Flickr and encyclopaedia articles from Wikipedia. It seems to me like it is doing a pretty good job at finding photos related to some Wikipedia topic from the Flickr pool. It works better for some kinds of topics (e.g. sights) and worse for others (e.g. scientific concepts). Well, typicality is in the eye of the beholder. These photos are what Örebro probably looks like to Flickr community. For you, Örebro might be about that dreary old post-card panorama. For them, Örebro might be about visiting friends and hanging out. Who is to say that you are right and they are wrong? Öjebro is not in Wikipedia and thus not in the flickrwrappr. (Searching for Öjebro on Flickr yields this photo: photos/ka-ka/752430/ ) So how about starting the Wikipedia article? You cannot use these search results in your own app or site without violating Google's TOS and several people's copyright. The Flickr API is open, and photos on Flickr are annotated with license information (often Creative Commons). It solves the problem of finding associations between two public datasets, Wikipedia and Flickr. More to the point, it finds photos with known copyright status related to Wikipedia topics. And I don't know anyone who would suggest that simply using a particular technology solves any problems. Nevertheless, I'd say that RDF is a sensible choice of technology for flickrwrappr because this makes it easy to integrate flickrwrappr results with DBpedia results. Cheers, Richard uHi all, It's simply about enriching DBpedia entries with photos about them. When making a statement \"this is a picture of that concept\", one should be pretty sure about the content, hence we involve geo-coordinates. It works well for lots of sights, museums, maybe even for famous restaurants (if they're so famous that they have a geo-tagged Wikipedia article). So when you're scouting for worthwhile sights in a foreign city, just look at some pictures from the flickr wrappr and you'll get a good idea of what they are about. In cases where we don't have geo-coordinates, the results will not be as good; and as far as cities are concerned, our 1km radius will certainly make no sense, as is pointed out on the website. However it should be fairly easy to filter out these cases and come up with ways to improve results here. Thanks for the note regarding the header encoding, we'll fix it shortly. Cheers, Christian" "Invalid character in infobox-mappingbased-loose.nt when trying to load DBPedia in virtoso" "uHi, I am trying to load the new DBPedia in Virtuoso, I am getting the error below, any ideas? Marvin Error 37000: [Virtuoso Driver][Virtuoso Server]SP029: TURTLE RDF loader, line 1032914: Invalid character in INITIAL expression at < at line 0 of Top-Level: ttlp_mt (file_to_string_output ('data/infobox-mappingbased-loose.nt'), '', ' BAD FILE. data/infobox-mappingbased-loose.nt Checkpoint start at: 17:13:45 uMarvin Lugair wrote:" "Setting up dbpedia endpoint" "uHi! I would like to establish my own endpoint on my machine using all the available dbpedia dumps. Where can I find instructions on how to do this? Thanks! 
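A minimal sketch of the usual Virtuoso bulk-load sequence for locally downloaded dump files; the ld_dir / rdf_loader_run procedures are built into Virtuoso 6.1.3 and later (otherwise they come from the bulk-loader script mentioned in the dump-3.7 thread further down), and the directory, file mask and graph IRI below are placeholders. The directory must also be listed in DirsAllowed in virtuoso.ini.

-- inside an isql session
ld_dir ('/data/dbpedia/en', '*.nt', 'http://dbpedia.org');
rdf_loader_run ();
checkpoint;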
Aliki u0€ *†H†÷  €0€1 0 +" "XMLStreamException when querying" "uHi, I have a problem when querying all DBpedia by Jena ARQ. My query is : SELECT * WHERE {?s ?p ?o } LIMIT 10000 OFFSET x. I increment 10000 value of x in each execution iteration. I took the mentioned exception in any execution iteration for each trying and i failed. What is the reason of this exception and why did i take in different iteration? Here is the exception: WARN [main] (Log.java:73) - XMLStreamException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0] com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0] at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686) at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2134) at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2040) at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.skipTo(XMLInputStAX.java:307) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.skipTo(XMLInputStAX.java:299) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.init(XMLInputStAX.java:183) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX. (XMLInputStAX.java:176) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX.worker(XMLInputStAX.java:135) at com.hp.hpl.jena.sparql.resultset.XMLInputStAX. (XMLInputStAX.java:98) at com.hp.hpl.jena.sparql.resultset.XMLInput.make(XMLInput.java:61) at com.hp.hpl.jena.sparql.resultset.XMLInput.fromXML(XMLInput.java:30) at com.hp.hpl.jena.sparql.resultset.XMLInput.fromXML(XMLInput.java:25) at com.hp.hpl.jena.query.ResultSetFactory.fromXML(ResultSetFactory.java:278) at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:151) uHi Same problem here, DBpedia server has problem and not recovered yet On Wed, May 2, 2012 at 3:09 PM, Ziya Akar < > wrote: uHi, Could you try again please and let me know if your problem is resolved or not. Patrick uHere resolved. Than you. On Wed, May 2, 2012 at 3:59 PM, Patrick van Kleef < >wrote: uHi The same problem that some days ago we had, is happening right now. javax.xml.stream.XMLStreamException Does the server have any problem? Thanks On Wed, May 2, 2012 at 4:02 PM, Saeedeh Shekarpour < >wrote: uHi, I will check the server. Patrick uHi Fixed, Thanks alot. Best regards On Fri, May 4, 2012 at 3:49 PM, Patrick van Kleef < >wrote:" "Wikidata and DBpedia canonicalized datasets" "uHi all, since the Wikidata project's phase I aims to remove interwiki links from wikipedia articles, I am expecting to start seeing problems with canonicalized datasets. datasets using the CanonicalizeUris script. How are you going to handle this situation? Are you going to retrieve interwiki links from Wikidata (with some APIs - since there is no official dataset ready yet)? Cheers Andrea Hi all, since the Wikidata project's phase I aims to remove interwiki links from wikipedia articles, I am expecting to start seeing problems with canonicalized datasets. From what I can see, DBpedia relies on such links to produces *_en_uris datasets using the CanonicalizeUris script. How are you going to handle this situation? Are you going to retrieve interwiki links from Wikidata (with some APIs - since there is no official dataset ready yet)? Cheers Andrea uHi Andrea, Wikidata is yet another mediawiki ;) what we plan to do (actually already started) is to treat it as an extra source. 
Wikidata already offers dumps so we can use it exactly like any other wikipedia language edition. of course we'll have to create a new extractor and adapt a few scripts but, other than I don't see any other big changes for now Best, Dimitris On Fri, Mar 15, 2013 at 12:38 PM, Andrea Di Menna < > wrote: uHi Dimitris, thanks for your answers. Have you already committed any code to git for the extractor? Also, are wikidata dumps those available here: I did not see any reference to this in the Wikidata page, and some info in the Italian Wikidata page got me confused :P Thanks Andrea 2013/3/15 Dimitris Kontokostas < > uHi all, there will also be RDF dumps from Wikidata soon, containing all the site links plus all the data and references. I would love to see DBpedia providing a view of the preferred statements from Wikidata btwWikidata will provide all kinds of statements for an entity: deprecated or outdated statements or also historical data. A cleaned view on this amount of data would be awesome and is definitely needed. I remember us talking about this at MLODE. Cheers, Anja On Mar 15, 2013, at 12:06, Dimitris Kontokostas < > wrote: u0€ *†H†÷  €0€1 0 +" "Querying the Arabic DBpedia" "uDear all; I am trying to query the Arabic chapter of DBpedia. My query includes some Arabic text, which I believe is the reason behind the failure of my query. Query in Java: Literal literal = ResourceFactory.createLangLiteral(\"نيسابور\", \"ar\"); Property property = ResourceFactory.createProperty(\" \"label\"); ParameterizedSparqlString queryString = new ParameterizedSparqlString(\"\" + \"PREFIX pro: \n\" + \"SELECT DISTINCT ?book ?label WHERE {\n\" + \"?book ?property ?label .\n\" + \"} LIMIT 100\"); System.out.println(queryString); queryString.setParam(\"label\", literal); queryString.setParam(\"property\", property); QueryExecution qexec = QueryExecutionFactory.sparqlService(\" queryString.asQuery()); The last line of this code throws an error: Exception in thread \"main\" org.apache.jena.query.QueryParseException: Encountered \" \"\\"\u0646\u064a\u0633\u0627\u0628\u0648\u0631\\" \"\" at line 4, column 23. Was expecting one of: \"from\" \"where\" \"(\" \"{\" at org.apache.jena.sparql.lang.ParserSPARQL11.perform(ParserSPARQL11.java:101) I tried to run the query using the sparql web interface: Query: PREFIX dbpar: SELECT DISTINCT ?book WHERE { ?book dbpar:label \"نيسابور\"} LIMIT 100 I tried setting the \"Default Data Set Name (Graph IRI)\" as, both, \" It runs successfully, but not results where returned, which should be because there is a result in [1]. Please is there a problem with my query? or am I missing something? Thank you. [1] * uHey again )) Nevermind, I figured out the issue, it was the ?label in this line: \"SELECT DISTINCT ?book ?label WHERE {\n\" I just removed it, and it's working now. Thank you. On 5 May 2016 at 12:49, Ahmed Ktob < > wrote:" "Large amounts of missing georss:point data" "uDear all, I've only recently started working with dbpedia as a research resource, so I apologize in advance if I'm asking a silly question with an obvious answer. Dbpedia is a wonderful resource, a huge thanks to all of you on the list who are putting hard work into it. It seems to me that there was a recent change to dbpedia that caused a good chunk of information I've been working with to go missing. 
I've been working with the geographical locations of military conflicts over the past few months, stitching together a dataset from dbpedia triples that linked several thousand battles to their locations via the georss:point predicate. I started rebuilding this dataset this week and it seems that much of that information I saw before is missing. To illustrate, the following query to the SPARQL endpoint at select COUNT(?battle) as ?count where { ?battle rdf:type dbo:MilitaryConflict . ?battle georss:point ?location . } Just to make sure I wasn't going crazy I decided to do essentially the same query to the dbpedia live SPARQL endpoint ( select COUNT(?battle) as ?count where { ?battle rdf:type dbpedia-owl:MilitaryConflict . ?battle georss:point ?location . } This returns 4225 hits, which is more reminiscent of the amount of results I got working with non-live dbpedia a few months ago. So, a few months ago, vanilla dbpedia gave me several thousand results, and dbpedia live right now does the same, but as of recently dbpedia vanilla only returns 11 results. Any reason that this may be the case? Perhaps I'm missing something quite simple? I appreciate any feed back. Thank you and best regards, Vincent Malic Ph.D Student in Information Science School of Informatics and Computing Indiana University, Bloomington Dear all, I've only recently started working with dbpedia as a research resource, so I apologize in advance if I'm asking a silly question with an obvious answer. Dbpedia is a wonderful resource, a huge thanks to all of you on the list who are putting hard work into it. It seems to me that there was a recent change to dbpedia that caused a good chunk of information I've been working with to go missing. I've been working with the geographical locations of military conflicts over the past few months, stitching together a dataset from dbpedia triples that linked several thousand battles to their locations via the georss:point predicate. I started rebuilding this dataset this week and it seems that much of that information I saw before is missing. To illustrate, the following query to the SPARQL endpoint at Bloomington uThank you for the report Vincent, The online data are based on the new to-be-announced release 2015-04 What you found is indeed quite strange as there were no changes in - - - and from a sample of pages with missing coords I checked there were no substantial changes On the other hand, running a sample extraction of a page returns correct coordinates which complies with the DBpedia Live data We will look further into this Best, Dimitris On Mon, Aug 3, 2015 at 10:08 PM, Vincent Malic < > wrote: uDear Dimitris, Thanks for the info, I really appreciate you looking into this. I've been working with the MilitaryConflict data so that's the window through which I noticed the missing triples, but since you mentioned there weren't any changes to the conflict mapping I poked around a little more and I'm lead to believe this isn't an issue with dbpedia extracting conflict data and may have to do with geographic coordinates in general. As way of illustration I took a look at the geographic coordinates dataset from the 3.9 downloads, which has 1,987,961 triples, while the corresponding geographic dataset on the 4.0 downloads page has only 2284 triples, which seems like a significant drop for geographic data across the whole dbpedia dataset. Perhaps there were some changes made to the extraction framework for the general extraction of the Wikipedia \"coord\" tag? Hope this information helps. 
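A quick way to reproduce the comparison described above without downloading the dump files is to count georss:point triples directly on each endpoint; a sketch:

PREFIX georss: <http://www.georss.org/georss/>
SELECT (COUNT(*) AS ?points)
WHERE { ?s georss:point ?location }

Run once against dbpedia.org/sparql and once against the live endpoint mentioned earlier in the thread; the two totals should show the same gap as the MilitaryConflict-restricted counts above.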
Once again, thanks for looking into this issue. Best, Vincent On Wed, Aug 5, 2015 at 3:25 AM Dimitris Kontokostas < > wrote: uHello Vincent, I ran the GeoExtractor for the English dump files again, as a first test. This resulted in the same geo-coordinates files as produced by the main extraction. The problem seems not to be near the surface. We will have a closer look at this in the coming weeks. Regards, Markus 2015-08-05 16:47 GMT+02:00 Vincent Malic < >: uHello Vincent, We identified the problem and added back a lot of missing coordinates. The problem will be 100% fixed in the next release which is due soon. We apologize for any inconvenience. Cheers, Dimitris Sent from my mobile, please excuse my brevity. Hello Vincent, I ran the GeoExtractor for the English dump files again, as a first test. This resulted in the same geo-coordinates files as produced by the main extraction. The problem seems not to be near the surface. We will have a closer look at this in the coming weeks. Regards, Markus 2015-08-05 16:47 GMT+02:00 Vincent Malic < >:" "Template mappings and ignore list" "uHi all, what is the process to insert templates into the ignore list? I am looking at Italian templates mapping and there are some which should be added, e.g. \"Nota disambigua\" which is for disambiguations, or \"Cassetto\" which just produces expandable boxes inside an article. Please let me know. Regards Andrea Hi all, what is the process to insert templates into the ignore list? I am looking at Italian templates mapping and there are some which should be added, e.g. 'Nota disambigua' which is for disambiguations, or 'Cassetto' which just produces expandable boxes inside an article. Please let me know. Regards Andrea uHi Andrea, On 6/13/13 11:53 AM, Andrea Di Menna wrote: These could just be left as they are now, since they are not mapped at all, and wouldn't affect the extraction. As a guideline for Italian, we should concentrate our mapping efforts on the so called \"Template sinottici\", which are the actual infoboxes [1]. Feel free to start brand new template mappings that appear in red in the stats page and are listed in [1]. The Italian community would highly appreciate this. Cheers! [1] >" "DBpedia SPARQL error when applying JENA" "uHi everyone, I am trying to retrieve some data from DBpedia by means of jena library. But when I apply my code as below I encounter an error as stated below. I could not be sure if here is the right environment to place my question but I could not find any help through web. Thanks in advance. 
Mehmet My CODE: package connectingurl; import com.hp.hpl.jena.query.*; public class DBpediaQuery { public static void main( String[] args ) { String s2 = \"PREFIX yago: \n\" + \"PREFIX onto: \n\" + \"PREFIX rdf: \n\" + \"PREFIX dbpedia: \n\" + \"PREFIX owl: \n\" + \"PREFIX dbpedia-owl: \n\" + \"PREFIX rdfs: \n\" + \"PREFIX dbpprop: \n\" + \"PREFIX foaf: \n\" + \"SELECT DISTINCT *\n\" + \"WHERE {\n\" + \"?city rdf:type dbpedia-owl:PopulatedPlace .\n\" + \"?city rdfs:label ?label.\n\" + \"?city dbpedia-owl:country ?country .\n\" + \"?country dbpprop:commonName ?country_name.\n\" + \"OPTIONAL { ?city foaf:isPrimaryTopicOf ?web }.\n\" + \"FILTER ( lang(?label) = \\"en\\" && regex(?country, 'Germany') && regex(?label, 'Homburg')) \n\" + \"} \n\" + \"\"; Query query = QueryFactory.create(s2); //s2 = the query above QueryExecution qExe = QueryExecutionFactory.sparqlService( \" //QueryExecution qExe = QueryExecutionFactory.create( query ); ResultSet results = qExe.execSelect(); ResultSetFormatter.out(System.out, results, query) ; } } The ERROR: Exception in thread \"main\" org.apache.http.conn.ssl.SSLInitializationException: Failure initializing default system SSL context at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368) at org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204) at org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82) at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118) at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466) at org.apache.http.impl.client.AbstractHttpClient.createHttpContext(AbstractHttpClient.java:286) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:851) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:137) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:118) at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1043) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:320) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:382) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:326) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:276) at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:345) at connectingurl.DBpediaQuery.main(DBpediaQuery.java:51) Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772) at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55) at java.security.KeyStore.load(KeyStore.java:1214) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:281) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:366) 15 more Caused by: java.security.UnrecoverableKeyException: Password verification failed at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770) 19 more Process exited with exit code 1. uHi Mehmet, I think you used the wrong URL for the sparql endpoint, you should have used \" . Try changing that in your code and tell us if you still get errors. Cheers, Alexandru On 03/05/2014 12:12 PM, Mehmet Ali Abdulhayoglu wrote: uHi Alexandru, Yes, indeed I tried the sparql endpoint but the result was same. 
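For anyone who lands on the same stack trace: the SSLInitializationException above is raised while Apache HttpClient builds its default scheme registry, that is, on the client before any request is sent, so neither the query nor the endpoint URL is the cause. "Keystore was tampered with, or password was incorrect" usually points at a javax.net.ssl.keyStore / keyStorePassword system property referencing the wrong store or password, or a damaged cacerts file. A stripped-down test such as the sketch below (current Apache Jena package names; the older com.hp.hpl.jena.query imports used in the thread behave the same way) helps confirm that the local JVM setup, not DBpedia, needs fixing:

import org.apache.jena.query.*;   // Jena 2.x: com.hp.hpl.jena.query.*

public class MinimalDBpediaQuery {
    public static void main(String[] args) {
        String q = "SELECT ?label WHERE { <http://dbpedia.org/resource/Berlin> "
                 + "<http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 5";
        QueryExecution qExe = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", q);
        try {
            // If even this minimal query fails with SSLInitializationException, check the JVM's javax.net.ssl.* settings.
            ResultSetFormatter.out(System.out, qExe.execSelect());
        } finally {
            qExe.close();
        }
    }
}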
In my previous message I forgot to mention. Here are the warning messages that can help: log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment). log4j:WARN Please initialize the log4j system properly. log4j:WARN See Thanks for your consideration. Best, Mehmet From: Ranma Saotome [mailto: ] On Behalf Of Alexandru Todor Sent: Wednesday 5 March 2014 1:18 PM To: Mehmet Ali Abdulhayoglu; Subject: Re: [Dbpedia-discussion] DBpedia SPARQL error when applying JENA Hi Mehmet, I think you used the wrong URL for the sparql endpoint, you should have used \" Cheers, Alexandru On 03/05/2014 12:12 PM, Mehmet Ali Abdulhayoglu wrote: Hi everyone, I am trying to retrieve some data from DBpedia by means of jena library. But when I apply my code as below I encounter an error as stated below. I could not be sure if here is the right environment to place my question but I could not find any help through web. Thanks in advance. Mehmet My CODE: package connectingurl; import com.hp.hpl.jena.query.*; public class DBpediaQuery { public static void main( String[] args ) { String s2 = \"PREFIX yago: \n\" + \"PREFIX onto: \n\" + \"PREFIX rdf: \n\" + \"PREFIX dbpedia: \n\" + \"PREFIX owl: \n\" + \"PREFIX dbpedia-owl: \n\" + \"PREFIX rdfs: \n\" + \"PREFIX dbpprop: \n\" + \"PREFIX foaf: \n\" + \"SELECT DISTINCT *\n\" + \"WHERE {\n\" + \"?city rdf:type dbpedia-owl:PopulatedPlace .\n\" + \"?city rdfs:label ?label.\n\" + \"?city dbpedia-owl:country ?country .\n\" + \"?country dbpprop:commonName ?country_name.\n\" + \"OPTIONAL { ?city foaf:isPrimaryTopicOf ?web }.\n\" + \"FILTER ( lang(?label) = \\"en\\" && regex(?country, 'Germany') && regex(?label, 'Homburg')) \n\" + \"} \n\" + \"\"; Query query = QueryFactory.create(s2); //s2 = the query above QueryExecution qExe = QueryExecutionFactory.sparqlService( \" //QueryExecution qExe = QueryExecutionFactory.create( query ); ResultSet results = qExe.execSelect(); ResultSetFormatter.out(System.out, results, query) ; } } The ERROR: Exception in thread \"main\" org.apache.http.conn.ssl.SSLInitializationException: Failure initializing default system SSL context at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368) at org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204) at org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82) at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118) at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466) at org.apache.http.impl.client.AbstractHttpClient.createHttpContext(AbstractHttpClient.java:286) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:851) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:137) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:118) at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1043) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:320) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:382) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:326) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:276) at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:345) at connectingurl.DBpediaQuery.main(DBpediaQuery.java:51) Caused by: java.io.IOException: 
Keystore was tampered with, or password was incorrect at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772) at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55) at java.security.KeyStore.load(KeyStore.java:1214) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:281) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:366) 15 more Caused by: java.security.UnrecoverableKeyException: Password verification failed at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770) 19 more Process exited with exit code 1. uHi again, I have just realized that when I go to the link on the screen it writes \"(Security restrictions of this server do not allow you to retrieve remote RDF data, see details .) \" So my problem is related with this. In the details I have tried to access But I got 403 forbidden message. Best, Mehmet From: Mehmet Ali Abdulhayoglu Sent: Wednesday 5 March 2014 1:21 PM To: ' '; Subject: RE: [Dbpedia-discussion] DBpedia SPARQL error when applying JENA Hi Alexandru, Yes, indeed I tried the sparql endpoint but the result was same. In my previous message I forgot to mention. Here are the warning messages that can help: log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment). log4j:WARN Please initialize the log4j system properly. log4j:WARN See Thanks for your consideration. Best, Mehmet From: Ranma Saotome [mailto: ] On Behalf Of Alexandru Todor Sent: Wednesday 5 March 2014 1:18 PM To: Mehmet Ali Abdulhayoglu; Subject: Re: [Dbpedia-discussion] DBpedia SPARQL error when applying JENA Hi Mehmet, I think you used the wrong URL for the sparql endpoint, you should have used \" Cheers, Alexandru On 03/05/2014 12:12 PM, Mehmet Ali Abdulhayoglu wrote: Hi everyone, I am trying to retrieve some data from DBpedia by means of jena library. But when I apply my code as below I encounter an error as stated below. I could not be sure if here is the right environment to place my question but I could not find any help through web. Thanks in advance. 
Mehmet My CODE: package connectingurl; import com.hp.hpl.jena.query.*; public class DBpediaQuery { public static void main( String[] args ) { String s2 = \"PREFIX yago: \n\" + \"PREFIX onto: \n\" + \"PREFIX rdf: \n\" + \"PREFIX dbpedia: \n\" + \"PREFIX owl: \n\" + \"PREFIX dbpedia-owl: \n\" + \"PREFIX rdfs: \n\" + \"PREFIX dbpprop: \n\" + \"PREFIX foaf: \n\" + \"SELECT DISTINCT *\n\" + \"WHERE {\n\" + \"?city rdf:type dbpedia-owl:PopulatedPlace .\n\" + \"?city rdfs:label ?label.\n\" + \"?city dbpedia-owl:country ?country .\n\" + \"?country dbpprop:commonName ?country_name.\n\" + \"OPTIONAL { ?city foaf:isPrimaryTopicOf ?web }.\n\" + \"FILTER ( lang(?label) = \\"en\\" && regex(?country, 'Germany') && regex(?label, 'Homburg')) \n\" + \"} \n\" + \"\"; Query query = QueryFactory.create(s2); //s2 = the query above QueryExecution qExe = QueryExecutionFactory.sparqlService( \" //QueryExecution qExe = QueryExecutionFactory.create( query ); ResultSet results = qExe.execSelect(); ResultSetFormatter.out(System.out, results, query) ; } } The ERROR: Exception in thread \"main\" org.apache.http.conn.ssl.SSLInitializationException: Failure initializing default system SSL context at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368) at org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204) at org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82) at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118) at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466) at org.apache.http.impl.client.AbstractHttpClient.createHttpContext(AbstractHttpClient.java:286) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:851) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:137) at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:118) at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1043) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:320) at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:382) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:326) at com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:276) at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:345) at connectingurl.DBpediaQuery.main(DBpediaQuery.java:51) Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772) at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55) at java.security.KeyStore.load(KeyStore.java:1214) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:281) at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:366) 15 more Caused by: java.security.UnrecoverableKeyException: Password verification failed at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770) 19 more Process exited with exit code 1." "Specifying my own XSLT stylesheet to use with DBpedia SNORQL results?" "uI put an XSLT stylesheet at copy of the standard xml-to-html.xsl one with a few arbitrary changes liked \"flag1\" showing up at the beginning of each td element.) 
On Results field and enter stylesheet URL\" field on the form, the results do not use that stylesheet. Can anyone tell me why? thanks, Bob uOn 10/1/11 3:42 PM, Bob DuCharme wrote: What happens with a raw SPARQL protocol URL, using the DBpedia endpoint? &xslt-uri; parameter takes the URI of and xslt resource . Kingsley uHi Kingsley, >What happens with a raw SPARQL protocol URL, using the DBpedia endpoint? >&xslt-uri; parameter takes the URI of and xslt resource . Virtuoso doesn't like it: curl \" Virtuoso 22023 Error The XSL-T transformation is prohibited Without that parameter, this works: curl \" thanks, Bob On 10/1/2011 6:37 PM, Kingsley Idehen wrote: u0€ *†H†÷  €0€1 0 + upublic as XSLT transformation is really powerful. Tell me about it" "How many nodes in the Virtuoso cluster?" "uI am curious if DBPedia uses Virtuoso as one big high performance machine, or a cluster of Virtuoso nodes. If a cluster, how many nodes are used to interact with the data via DBPedia official SPARQL endpoint? I am curious if DBPedia uses Virtuoso as one big high performance machine, or a cluster of Virtuoso nodes. If a cluster, how many nodes are used to interact with the data via DBPedia official SPARQL endpoint? uOn 2/6/14 5:10 PM, Kristian Alexander wrote: The information about any Virtuoso instance is always printed in the footer page of instances such as DBpedia, DBpedia-Live (our hosted edition), LOD Cloud Cache etc Links: [1]" "Obtaining confidence values associated with Dbpedia facts" "uDear All, I am Jyoti, PhD student at IIIT-Delhi India. For one of my research projects I need the confidence values associated with facts of DBpedia. Is it possible to get DBpedia dataset containing these values. If not, can you please share the rules or procedure by which confidence values are assigned to facts in DBpedia? Thanks & Regards, Jyoti uDear Jyoti, DBpedia facts do not come with confidence values, and to the best of my knowledge there are no rules or procedures implemented in the framework to assign such values. Cheers, Volha On 11/18/2014 6:36 AM, Jyoti Leeka wrote:" "GSOC 2016 Automatic Mappings Extraction & Upgrade Sweble Parser" "u*Mentor: Dimitris KontokostasStudent: Aditya Nambiar* GSoC Project: Automatic Mappings Extraction & Upgrade Sweble Parser Link: project/6253053984374784/details/ TL;DR;: The task of this project was to create extractors that identify Wikidata annotations in Wikipedia articles and Wikipedia templates and transform them to DBpedia mappings. We not only managed to complete this task but also upgraded the DBpedia parser to enable the parsing of more complex / nested templates. Click here to view this report nicely formatted: JsoDSahzYECwLk/ Here’s a long version DBpedia currently maintains a mapping between Wikipedia info-box properties to the DBpedia ontology, since several similar templates exist to describe the same type of info-boxes. The aim of the project is to enrich the existing mapping and possibly correct the incorrect mapping using Wikidata. *Extracting Article Wikidata annotations* Wikipedia provides parser functions that can fetch values from wikidata and display them directly in a wikipedia article [ link] . For example in an article, we can find the following: {{ Infobox Test1 | area_total_km2 = 54.84 | population_as_of = {{#invoke:Wikidata|getQualifierDateValue|P1082| P585|FETCH_WIKIDATA|dmy}} | population_note = | population_total = {{#property:P1082}} }} We extract this information, and generate: 1. (\"Infobox Test1\",\"population_as_of\",\"P1082/P585\"), 2. 
(\"Infobox Test1\",\"population_total\",\"P1082\") At the end we evaluate all the triples Link to the extractor - link *Extracting Template Wikidata annotations* Sometimes the wikidata annotations are embedded directly in a wikipedia template. In those case we assume that the mapping is direct. For example in the infobox template, we can find the following Inside page “Infobox Test1” (the infobox definition) | data37 = {{#if:{{{website|}}} |{{#ifeq:{{{website|}}}|hide||{{{website|}}} }} |{{#if:{{#property:P856}} |{{URL|{{#property:P856}}}} }} }} | established_date = {{#if: {{{established_date|}}} | {{{established_date}}} | {{#invoke:Wikidata|property|P765}} }} }} And we extract this information and generate 1. (\"Infobox Test1\",\"website\",\"P856\") 2. (\"Infobox Test1\",\"hide\",\"P856\") 3. (\"Infobox Test1\",\"established_date\",\"P765\") 4. (\"Infobox Test1\",\"URL\",\"P856\") Annotations in templates are considered more credible and can be applied directly while annotation in articles need some extra post processing to identify possible outliers. (left as a follow up work) Link to the extractor - link *Advanced WikiText parsing* To do the above extractions, we made extensive use of the AST generated by the simple parser. However in several cases the simple parser failed to create a correct AST, especially when it has nested template parameters like for eg :- try2 = {{#if: abc |{{#ifeq:{{{website|}}}|hide||{{{website|}}} }} | pqrs }} The simple parser would create text nodes with text = \"}|hide||\", which makes no sense. The earlier Simple Parser failed at parsing ParserFunctionNodes which was important for the first phase of the project. The Sweble Parser solves the problem. *Upgrade Sweble Parser* As mentioned above to deal with the cases where the simple parser fails we decided to upgrade the existing sweble parser to V2.1 *Work Done* Successfully upgraded the parser and added several additional functionality to the sweble parser which the earlier Sweble Wrapper did not do such as XmlElements, ImageLinks etc We then created parameterized unit tests to help developers know where the two parser create similar AST’s and where they differ by overriding the “equals to” operator in each of the children classes of the Node class . Parameterized unit tests also make it very easy to add new test cases. We also tested the 2 parsers across several diverse wikipedia pages from abstract things like Renaissance to, books and famous people like Adolf hitler, monuments etc. Project code / commits master?author=aditya-nambiar Click here to view this report nicely formatted: JsoDSahzYECwLk/ Mentor: Dimitris Kontokostas Student: Aditya Nambiar   GSoC Project: Automatic Mappings Extraction & Upgrade Sweble Parser Link: u*Mentor: Dimitris KontokostasStudent: Aditya Nambiar* GSoC Project: Automatic Mappings Extraction & Upgrade Sweble Parser Link: 6253053984374784/details/ TL;DR;: The task of this project was to create extractors that identify Wikidata annotations in Wikipedia articles and Wikipedia templates and transform them to DBpedia mappings. We not only managed to complete this task but also upgraded the DBpedia parser to enable the parsing of more complex / nested templates. Click here to view this report nicely formatted. Here’s a long version DBpedia currently maintains a mapping between Wikipedia info-box properties to the DBpedia ontology, since several similar templates exist to describe the same type of info-boxes. 
The aim of the project is to enrich the existing mapping and possibly correct the incorrect mapping using Wikidata. *Extracting Article Wikidata annotations* Wikipedia provides parser functions that can fetch values from wikidata and display them directly in a wikipedia article [ link] . For example in an article, we can find the following: {{ Infobox Test1 | area_total_km2 = 54.84 | population_as_of = {{#invoke:Wikidata|getQualifie rDateValue|P1082|P585|FETCH_WIKIDATA|dmy}} | population_note = | population_total = {{#property:P1082}} }} We extract this information, and generate: 1. (\"Infobox Test1\",\"population_as_of\",\"P1082/P585\"), 2. (\"Infobox Test1\",\"population_total\",\"P1082\") At the end we evaluate all the triples Link to the extractor - link *Extracting Template Wikidata annotations* Sometimes the wikidata annotations are embedded directly in a wikipedia template. In those case we assume that the mapping is direct. For example in the infobox template, we can find the following Inside page “Infobox Test1” (the infobox definition) | data37 = {{#if:{{{website|}}} |{{#ifeq:{{{website|}}}|hide||{{{website|}}} }} |{{#if:{{#property:P856}} |{{URL|{{#property:P856}}}} }} }} | established_date = {{#if: {{{established_date|}}} | {{{established_date}}} | {{#invoke:Wikidata|property|P765}} }} }} And we extract this information and generate 1. (\"Infobox Test1\",\"website\",\"P856\") 2. (\"Infobox Test1\",\"hide\",\"P856\") 3. (\"Infobox Test1\",\"established_date\",\"P765\") 4. (\"Infobox Test1\",\"URL\",\"P856\") Annotations in templates are considered more credible and can be applied directly while annotation in articles need some extra post processing to identify possible outliers. (left as a follow up work) Link to the extractor - link *Advanced WikiText parsing* To do the above extractions, we made extensive use of the AST generated by the simple parser. However in several cases the simple parser failed to create a correct AST, especially when it has nested template parameters like for eg :- try2 = {{#if: abc |{{#ifeq:{{{website|}}}|hide||{{{website|}}} }} | pqrs }} The simple parser would create text nodes with text = \"}|hide||\", which makes no sense. The earlier Simple Parser failed at parsing ParserFunctionNodes which was important for the first phase of the project. The Sweble Parser solves the problem. *Upgrade Sweble Parser* As mentioned above to deal with the cases where the simple parser fails we decided to upgrade the existing sweble parser to V2.1 *Work Done* Successfully upgraded the parser and added several additional functionality to the sweble parser which the earlier Sweble Wrapper did not do such as XmlElements, ImageLinks etc We then created parameterized unit tests to help developers know where the two parser create similar AST’s and where they differ by overriding the “equals to” operator in each of the children classes of the Node class . Parameterized unit tests also make it very easy to add new test cases. We also tested the 2 parsers across several diverse wikipedia pages from abstract things like Renaissance to, books and famous people like Adolf hitler, monuments etc. Project code / commitsAll Commits to Master branch - link Pull Request Click here to view this report nicely formatted. Mentor: Dimitris Kontokostas Student: Aditya Nambiar   GSoC Project: Automatic Mappings Extraction & Upgrade Sweble Parser Link: formatted." 
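The report above describes the real extractors in the DBpedia extraction framework. Purely as an illustration of the textual pattern being extracted, and not the project's actual code, a toy scan for {{#property:P...}} and {{#invoke:Wikidata|...}} calls in raw wikitext could look like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WikidataAnnotationToy {
    // Simplified patterns for {{#property:P1082}} and {{#invoke:Wikidata|...|P1082|...}} calls.
    private static final Pattern PROPERTY = Pattern.compile("\\{\\{#property:(P\\d+)\\}\\}");
    private static final Pattern INVOKE   = Pattern.compile("\\{\\{#invoke:Wikidata\\|[^}]*?(P\\d+)[^}]*\\}\\}");

    public static void main(String[] args) {
        String wikitext = "{{Infobox Test1 | population_total = {{#property:P1082}} "
                        + "| population_as_of = {{#invoke:Wikidata|getQualifierDateValue|P1082|P585|FETCH_WIKIDATA|dmy}} }}";
        for (Pattern p : new Pattern[] {PROPERTY, INVOKE}) {
            Matcher m = p.matcher(wikitext);
            while (m.find()) {
                System.out.println("found Wikidata property: " + m.group(1));
            }
        }
    }
}

A naive scan like this breaks as soon as the parser-function calls nest, which is exactly the motivation the report gives for upgrading to the Sweble parser instead of relying on the simple one.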
"Deadline approaching for Wikimania 2016" "uHi folks, The deadline for submitting a main track proposal to Wikimania 2016 is January 7th: Please note that it concerns the main track, i.e., big competition and Wikimedia-specific critical issues. Based on a recent chat with a Wikimedia folk, there should also be a poster session with many more slots, which in my opinion best fit a DBpedia submission. I have not seen the page discussing this, but there should be one. Cheers," "Distributed DBpedia Extraction (Open Beta)" "uDear all, We are happy to announce an early beta version of Distributed DBpedia Extraction with Hadoop / Spark. Things are still rough but we want beta testers to report their experience - and extraction time of course. :) Read ahead if you are interested Right now we only support extraction, which means that you need to download the dumps with the existing method (distributed downloading is our next step) Setting up the framework and performing a distributed extraction is fairly easy; we have outlined all the details in the README and added a script for firing up a Spark+HDFS cluster quickly on Google Compute Engine. For a single language, the whole extraction job (including redirects) is executed in parallel. If you add multiple languages, all jobs are submitted to Spark, and based upon Spark’s configured scheduling mode, they’ll be scheduled over the cluster in parallel either in a FIFO (default) or FAIR manner. We did some tests on a small 3-node cluster: 1 master (2 core 7.5G RAM - GCE n1-standard-2), 2 slaves (4 core 15G RAM each - GCE n1-standard-4) with 4 workers on each slave. Using the English Wikipedia, the distributed framework took a total of 3hrs. 21 min. to finish extraction (including the pre-extraction redirects computation). We’ll add more tests and benchmarks to the GitHub wiki pages very soon. Any feedback is more than welcome. We keep track of our future tasks and bugs @GitHub Cheers, Nilesh, Sang & Dimitris Acknowledgements: This project is sponsored by the Google Summer of Code project. You can also email me at or visit my website Dear all, We are happy to announce an early beta version of Distributed DBpedia Extraction with Hadoop / Spark. Things are still rough but we want beta testers to report their experience - and extraction time of course. :) website" "yago categories in German dbpedia" "uHi *, (I was redirected from the German dbpedia list[5] to this mailing list). (Btw, sorry for not having the time to further examine the paper which explains yago and dbpedia[1]. I am just pointing to things showing on the surface.) A German wikipedia resource[0] describes a novel and thus is categorizied as literature. Now, their is a the link to \"other languages\", and these resources describe the movie based on that novel. (It's wrong to link theses resources with owl:sameAs, but of course you can only take what's there.) The german dbpedia entry is rdf:typed using a subset of rdf:type clearly coming from the international dbpedia[3]. Now, what makes me curious is: 1. why is for the German dbpedia entry[4] only taken a subset of rdf:type of the English one[3] ? 2. why is no rdf:type reflecting that the resource[4] is (at least) also a novel? (I guess being typed as literature and a movie at the same time is semantically a contradiction, but then it would be fine to mark that contradiction in the dataset.) 
-o [0] [1] [3] [4] [5] forum.php?thread_name=4FB3E30F.6080503%40gmail.com&forum;_name=dbpedia-germany uHi Pascal, The reason I've asked you to forward the discussion here is because the problem seems to be a bit more general than just limited to the DBpedia German. First of all thanks to your post I noticed that our interlinking script had some bugs in it, forwarded that issue to Dimitris who discovered even more bugs, so now we should have a better Yago categorization in the country chapters. As an answer to your specific questions Because each internationalized DBpedia doesn't directly map yago classes to specific resource, I don't even think we could since Yago is in English(however I have no idea, maybe it would work by defining custom extraction rules and them mapping them to the corresponding English Ontology classes). We use the DBpedia Internationalization Interlinking shell script [1] which goes over the interlanguage links and and copies the Yago categorization from the matching DBpedia.org resource. Because Wikipedia contains incorrect data, as in many other cases. If you look at [2] you will see that the Wikipedia article describes the movie not the novel, however it is linked to an article in the German Wikipedia that describes the novel not the movie [3]. The previous issue results in a mismatch in the Yago categorisation in the german DBpedia endpoint, and can only be remediated by correcting the language links in the English Wikipedia. [1] [2] [3] On 05/18/2012 09:44 AM, Pascal Christoph wrote:" "IP blacklist, DBpedia dump 3.7 and Openlinksw Sparql endpoint" "uHi all, I think one of our IP addresses has been blacklistened by dbpedia servers. We use these addresses just for research purposes within my university. Who should I contact for kindly asking to enable it again? Ok, the obvious answer to this question could be: \"install a local dump of DBpedia and don't bother DBpedia server\". Well, that's what I would really like to do. We used to have dbpedia dumps 3.5. Then, we recently decided to install a brand new fresh version with dump 3.7. The nightmare started. :-) Here's my story. I successfully installed Virtuoso Opensource 6.1.4 (latest version) on a Linux Ubuntu 10.04 64bit distribution with 32GB ram. Then, I tried several times to follow the instructions at: (I successfully did the same a couple of years ago for dump 3.5). Unfortunately, after one hour or two of correct execution, the rdf_loader_run() procedure stucks, the virtuoso-t process result active (at least it seems to be active since a \"ps aux|grep virtuoso\" shows me the process has not been killed), but everything concerning virtuoso seems to be dead: the web interface respond anymore, a \"top\" command from the shell does not show \"virtuoso\" (while in the beginning it used 100% of CPU), the \"isql-v\" command allows me to correctly log in, but then the instructions does not respond. The virtuoso.log file does not show anything wrong. Finally, I've observed (from the virtuoso.log file) some of the .nt files of the dump contain incorrect triples. For example, I get this error message: File /dbpedia-dump/3.7/en/external_links_en.nt error 23000 SR133: Can not set NULL to not nullable column 'DB.DBA.RDF_QUAD.O' The problem is that when such an error is encountered, I think the loading of that file does not go on. In other words, I could lose important triples. Does anyone has any successfully/unsuccessfully experiences about installing DBpedia dump 3.7 on Virtuoso? 
ps: where is the openlinksw beloved endpoint? Thanks in advance, roberto uRoberto, I think is very useful. We (well, Sarven, in CC) have done it for the DBpedia mirror Ireland [1] with the following spec: System: Ubuntu x86_64 GNU/Linux Memory: 16GB Disk: 2TB (swap: 16 GB) Filesystem: ext4 Server: Apache/2.2.14 Datadump: Source: Modified: 2011-09-13 Size on disk: 354GB Number of files: 17694 HTTP GET time length: ~3 days (~65.5 hours for raw files at 1.5MB/s + overhead) Replication: Requirements: wget HTTP GET command: wget -vcNr -w5 -np -nH HTH Cheers, Michael [1] uHi Roberto, Have you tuned your Virtuoso Server for hosting large datasets as indicated in the Bulk loader document, which directs you to : as it source as if the server is under resourced ? The LOD Cloud Cache (lod.openlinksw.com) is currently down for major Virtuoso Server update and load of 50+ billion triples, and was announced on twitter etc. It should be back online soon Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // 10 Burlington Mall Road, Suite 265, Burlington MA 01803 Weblog uMichael, Il 29/02/2012 13:36, Michael Hausenblas ha scritto: thank for you reply. I basically followed the same guidelines for the import. Actually the BulkLoadingScript ( is not necessary for Virtuoso after 6.1.3 (it's written here: first item list). I've read in the comments to that blog post that several people experienced some problems with dump 3.7. Concerning the encoding, I've read somewhere that Virtuoso 6.1.4 should solve such problems. That's the reason why I installed Virtuoso 6.1.4. But this didn't avoid the import to stuck. Did you (Sarven, anyone else) load dump 3.7 on Virtuoso? cheers, roberto uHi Hugh, Il 29/02/2012 13:51, Hugh Williams ha scritto: thank you very much for your answer. Since our server has 32GB ram, I set these values for buffers: ;; Uncomment next two lines if there is 16 GB system memory free NumberOfBuffers = 1360000 MaxDirtyBuffers = 1000000 Then, [Database] Striping = 0 [Parameters] ServerThreads = 100 DirsAllowed = path-to-the-dump [HTTPServer] ServerThreads = 100 [SPARQL] ResultSetMaxRows = 120000 MaxQueryCostEstimationTime = 60000 ; in seconds MaxQueryExecutionTime = 600 ; in seconds The rest is was not modified. It seems to be everything correctly tuned. :-( regards, roberto u0€ *†H†÷  €0€1 0 + uHi Roberto, You can send the IP address to me privately and i will check the server to see what is going on. Patrick uHugh, Il 29/02/2012 14:59, Hugh Williams ha scritto: It's exactly as you said. I'll try to modify this value and see what happens. Thanks a lot also for the hint about rdf_loader_run(). In this case, are you suggesting me to open several isql prompts, and distribute the files to load on each process with ld_add('file1'), ld_add(file2), etc., and then launch rdf_loader_run() from each prompt, or the \"ld_add\" is not necessary? Thank you very much, roberto u0€ *†H†÷  €0€1 0 + uHi All This is a general notification to the mailing list that the \"SR133: Cannot set NULL to not nullable column ‘DB.DBA.RDF_QUAD.O’ “ errors reported when loading some of the Dbpedia 3.7 datasets into Virtuoso can be resolved by setting the “ShortenLongURIs” parameter in the Virtuoso configuration file as detailed at: Best Regards Hugh Williams Professional Services OpenLink Software, Inc. 
// 10 Burlington Mall Road, Suite 265, Burlington MA 01803 Weblog" "DBpedia Mappings Statistics" "uHi all, we have created a statistics page for the DBpedia infobox mappings: This page can be used to get an idea of which new mappings would be most effective. If you would like to contribute to increase the number of mappings, please visit to raise the amount of high-quality data in DBpedia. The statistics reflect the current state of the mappings in respect to the actually used templates and properties in Wikipedia. They are based on the dump dated 2011-06-20. The stats are updated live against this dump. At the moment, the statistics only exist for English, but we are working on collecting them for the other languages as well and will make them available soon. Thanks to everybody who is helping to improve DBpedia! Cheers, Paul & Max uHi all, the mapping statistics are now available also for: Greek(el) Hungarian(hu) Polish(pl) Portuguese(pt) Slovene(sl) I have restricted the access to the ignore list, because there were some strange edits, i hope only caused by search robots and not by vandalism. happy mapping, paul uOn 20 July 2011 14:30, Paul Kreis < > wrote: Thanks! Can I get the following added to the pl ignore list? Cytuj książkę (\"Cite book\") Cytuj stronę (\"Cite webpage\") Przypisy (\"Footnote\") uHi again, new mapping statistics for: Catalan(ca) German(de) Spanish(es) French(fr) Irish(ga) Croatian(hr) Italian(it) Dutch(nl) Russian(ru) http://mappings.dbpedia.org/server/statistics/ru/ Turkish(tr) http://mappings.dbpedia.org/server/statistics/tr/ happy mapping, paul" "Machine readable property mappings" "uHi all, Is there a file with the Infobox property mappings? For instance, for the person infobox, the mappings are in the Wiki: that the Infobox template property \"name\" is mapped to \"foaf:name\" A csv or similar would be ideal (I have looked in the DBPedia source, in dbpedia/dbpedia/ontology, but it is not clear to me what each file is). Thanks, Guillermo Hi all, Is there a file with the Infobox property mappings? For instance, for the person infobox, the mappings are in the Wiki: Guillermo uGuillermo, I believe this is stored in a MySQL database that is populated/read by the mappings wiki. It would be great to have this in a MongoDB or other store that would be convenient to store a mapping as a document with fields or object with attributes that we could then query under many perspectives: - all mappings using a given property, - all properties for a mapping - all infobox properties with high string similarity to a given ontology property, - etc Is this somenthing similar to your intended use of the mappings? Would you be interested in contributing something like this to the community? Cheers, Pablo On May 11, 2011 8:22 AM, \"Guillermo Garrido\" < > wrote: uHi, I'm new to the DBPedia codebase, so: ¿how do I build this MySQL database? I imagine is from the sourcecode inside the ontology namespace, but the procedure to do it does not seem to be documented. Yes, I would be happy to do this; we should discuss the best way to do it. Cheers, Guillermo uHi Guillermo, On May 11, 2011, at 6:14 PM, Guillermo Garrido wrote: the mappings and ontology pages are stored in the MediaWiki database. But I guess you'd rather like to have a mapping file for infoboxes and their properties to the corresponding ontology properties and classes? At least your first mail sounded like that. 
This would not be possible to provide other than it is because the DBpedia Mapping Language does allow you various ways to map infoboxes. For example you could map infoboxes to different classes based on property values in the infobox. Can you describe your goal / main interests in that? Cheers, Anja" "Credits page: please acknowledge Wikipedia / MediaWiki" "uRe The 'credits' for dbpedia don't currently acknowledge any contribution outside of the core DBpedia project. This is natural as a 'who are the dbpedia team?' page (ie. who made this site/software), but since you link the page as 'Credits' from the homepage, and have Credits at the top, it would be appropriate to offer some broader acknowledgements. Even if it's somehow obvious that DBpedia is from/for/by Wikipedia-lovers, sometimes it is nice to state the obvious. In conversation people casually conflate DBpedia with the *data*, and the credits for the data include the thousands at Wikipedia who edited all those pages. May I take the liberty of suggesting some text? \"\"\" DBpedia is essentially a data-oriented interface to the work of the Wikipedia community , and wouldn't exist without the massive contributions made by editors and authors at Wikipedia, or without the developers who built the underlying infrastructure that supports Wikipedia . DBpedia is intended to show how the information spread across many Wikipedia pages can be combined into a single integrated database. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopaedia itelf. DBpedia is our way of saying \"thank you!\" for Wikipedia\"\"\" cheers, Dan uHi Dan, thank you for pointing this out and sorry that we did not come up with this ourself. I have added your text to the credits page and also to the project outline on the startpage. Thanks and cheers, Chris uOn Mon, Nov 23, 2009 at 12:11 PM, Chris Bizer < > wrote: Thanks for the quick fix, and to you all for a great project :) cheers, Dan" "help : All data in CSV format for a particular entity" "uHi, I browsed a sample for available data at columns, so I downloaded the CSV files from Now, when I browse the data for the same entity \"Sachin_Tendulkar\", I find that many of the properties are not available. e.g. the property \"dbpprop:bestBowling\" is not present. How can I get all the properties that I can browse through Regards uHi Abhay, the DBpediaAsTables dataset only contains the properties in the dbpedia-owl namespace (mapping-based infobox data) and not those from the dbpprop (raw infobox properties) namespace (regarding the differences see [1]). However, as you are only interested in the data about specific entities, take a look at the CSV link at the bottom of the entity's description page, e.g., for your example this link is [2]. Cheers, Daniel [1] [2] On 22.11.2014 10:43, Abhay Prakash wrote:" "Dependency on extraction-framework" "uHello, Sorry for maybe a trivial question but how do I compile another project that depends on extraction-framework? Is the latest artifact published somewhere? I am trying to work on distributed-extraction-framework (which depends on the 4.x SNAPSHOT). The workaround I have right now is to clone extraction-framework and install it locally before compiling distributed-extraction-framework, which is might not very clean (and time/resource consuming on travis-ci for example). 
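Back to the question above about getting every property of a single entity: the CSV files only carry the mapping-based data, but a one-pattern SPARQL query returns everything the endpoint knows about a resource. A small sketch with SPARQLWrapper (used elsewhere on this list), assuming the public endpoint at http://dbpedia.org/sparql; it separates the dbpedia-owl (mapping-based) and dbpprop (raw infobox) namespaces, which is exactly the distinction Daniel points out.

# Sketch: list all outgoing properties of one DBpedia resource and mark
# whether each comes from the mapping-based or the raw infobox namespace.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?p ?o WHERE {
      <http://dbpedia.org/resource/Sachin_Tendulkar> ?p ?o .
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    p, o = row["p"]["value"], row["o"]["value"]
    if p.startswith("http://dbpedia.org/ontology/"):
        source = "mapping-based (dbpedia-owl)"
    elif p.startswith("http://dbpedia.org/property/"):
        source = "raw infobox (dbpprop)"
    else:
        source = "other"
    print(source, "|", p, "|", o)

Note that the public endpoint caps result sizes, so very large descriptions may come back truncated.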
Thanks, uHi Duong Dang, We plan a code maintenance on the distributed extraction framework but till then you can use the latest DBpedia artifacts from maven central We also welcome any code contributions on the (distributed) extraction framework via pull requests Cheers, Dimitris On Tue, May 10, 2016 at 12:04 PM, Duong Dang < > wrote: uHi Dimitris, Thanks for the prompt reply. It is working now for me on travis-ci. I'll contribute for sure when I have something. Regards, On Tue, May 10, 2016 at 11:14 AM, Dimitris Kontokostas < > wrote:" "Skos subject properties are deprecated" "uHi, I am new to this list, but in a discussion on another list we were discussing the use of the skos:subject and related items, something which dbpedia has invested in heavily to represent the wikipedia category system. The latest SKOS draft has deprecated these properties. What will dbpedia use instead? Peter Ansell uPeter, On 24 Jan 2008, at 05:41, Peter Ansell wrote: Can you give us some background on this decision? I have a hard time understanding why this step was taken. I don't know. Do you have any suggestions? Richard uHello all Some precisions before anyone gets carried away :-) The latest SKOS draft Peter mentions is certainly the editor's draft at This is only an editor's draft and has no official status whatsoever The skos:subject property is mentioned as \"at risk\", which means its relevancy is questioned. It's not *deprecated* so far AFAIK, but under discussion. There are two related \"open issues\" on this former : \"The SKOS model should contain mechanisms to attach a given resource (e.g. corresponding to a document) to a concept the resource is about, e.g. to query for the resources described by a given concept.\" I think this is obvious. Otherwise what is the point of SKOS altogether? The property skos:subject was (and still is) candidate to support this mechanism. As has been pointed in e.g., the other ongoing thread on dbpedia list [1], the term \"subject\" can appear to be too specific in meaning to cover all cases of linking a resource to a concept, and strange in some borderline cases. But it's more a question of terminology than a question of need of such a generic property. In the referenced thread, I think the criticism should be more interpreted as a weird construction of Wikipedia categories (some are very weird indeed) than as a mistake in using skos:subject in DBpedia to represent the Wikipedia categorisation. My take on this is that such a generic property is needed and should not be deprecated. Since a lot of people (including dbpedia folks, but not only) have started using skos:subject in the above quoted very generic sense, and I think they are OK to do so, it should be kept as is. But it should be put in best practices that whenever you want to specify an indexing property, you define a specific subproperty of skos:subject. SKOS specification should stress and explain what the *functional* semantics of this property are, and are not. Simply to *retrieve resources* indexed on a concept. Not to infer any specific semantics on the indexing link. Just : \"If you are interested in this concept, here are resources dealing about it in some way\". No more, no less. If you want to be more specific, use a specific subproperty. Bernard [1] Richard Cyganiak a écrit : uAll, I agree with Bernard that SKOS needs a property for attaching resources to concepts. 
The problem with skos:subject at the moment is this: The Core Guide gives the impression that the domain of skos:subject is documents only. But there is no explicit domain declared in the vocabulary definition. Furthermore, several parties (e.g. DBpedia) have a clear need for a property that relates non-document resources to skos:Concepts. I think there are three options for resolving this: A) Clarify that the domain of skos:subject is indeed any resource, and that the term “subject” is used loosely here. B) Clarify that the domain of skos:subject is documents only, and introduce a new super-property of skos:subject that explicitly covers any resource. It could be named for example skos:category, or skos:indexedAs. C) Clarify that the domain of skos:subject is documents only, and leave the task of defining a property non-document resources to others. My preference would be, in that order, B), A), C). Thanks, Richard On 24 Jan 2008, at 12:45, Bernard Vatant wrote: uOn 24 Jan 2008, at 13:36, Mikael Nilsson wrote: Like skos:subject, dcterms:subject seems to be intended for use on documents, not people or cities. Hence it doesn't really meet DBpedia's requirements. Richard uOn 24 Jan 2008, at 14:13, Mikael Nilsson wrote: We have a skos:Concept for “History of the Internet”. We have Tim Berners-Lee, the person, as a resource. The question is how to relate them. Saying that the “subject” of the person “Tim Berners-Lee” is “History of the Internet” is a bit of a stretch. My problem is not with the term “resource”, but with the term “subject”. I don't doubt that DC properties, in general, are applicable to all kinds of resources. But some sorts of resources don't really have subjects. Best, Richard uLeonard, and all Leonard's point goes along the same line as my previous message. Let's extend the notion of document to be equivalent to \"resource\", and we're done. That was all the point of the URI specification to begin with. Another hit on that nail : The notion of document is extended in many ways in various communities to anything bearing information. The first information on a thing being to be made distinct from the continuum of the universe, through naming, identifying, and asserting distinctive properties (such as a URI), as long as something is identified, it bears information, hence can be considered a document. A bit farfetched maybe, but closing the debate, conceptually and technically. Bernard PS : I vote for \"A\", of course. Leonard Will a écrit : uRichard Think about TBL as a (living) document, and you'll see it another way. Why not? What is the fundamental difference between : On the subject 'History of Internet', see ' On the subject 'History of Internet', see ' None of those assertions says how you will *use* those resources : read the story, follow the link, or ask/write/phone TBL himself if you can They both say that the resource is relevant to the subject. Bernard uOn 24 Jan 2008, at 15:26, Simon Spero wrote: Sorry, but you lost me there. Where I live, people are not documents, and I like it here. Richard uRichard Cyganiak wrote: Sure people are not documents, but People and Documents are concepts that can be associated with one another. Put differently Data Object of Type Person is a Concept and Data Object of Type Document is another. Both can be associated in a Data Graph using the appropriate link/predicate. Kingsley uAntoine Isaac a écrit : Briet, S. (1951). /Qu'est que la documentation?/. Paris: Editions Documentaires Industrielles et Techniques. 
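Setting the terminology question aside for a moment, the functional use Bernard describes, retrieving the resources indexed on a concept, is a single triple pattern. A sketch against the public endpoint, assuming the category links are published as skos:subject as in the DBpedia releases under discussion (later releases switched to dcterms:subject, so both are tried); the category URI is only an example.

# Sketch: retrieve resources attached to one Wikipedia category concept.
from SPARQLWrapper import SPARQLWrapper, JSON

CATEGORY = "<http://dbpedia.org/resource/Category:History_of_the_Internet>"

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX dct:  <http://purl.org/dc/terms/>
    SELECT DISTINCT ?resource WHERE {
      { ?resource skos:subject %s }
      UNION
      { ?resource dct:subject %s }
    }
    LIMIT 100
""" % (CATEGORY, CATEGORY))
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["resource"]["value"])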
See references, including Simon's reference to Otlet Note that this way older than most of us are, even including myself :-) I would not say \"TimBL's description on Wikipedia\", but \"TimBL as defined in his Wikipedia description\" If that case, the problem would not be in DBpedia interpretation, but in Wikipedia categorisation, which should change then to, well, say \"Dating profile\"?. The real problem here is to know if TimBL has only one identity, defined by a single URI (or the set of all its owl:sameAs equivalents), to which a consistent set of assertions can be attached. Seen at the level of current DBpedia, the answer is assumed to be \"yes\". But the more precisely you will look at it, the more you will need to split this so-called individual in a cluster of avatars (the child, the student, the CERN engineer, the tax-payer, the patient, the Web inventor, etc ), which will at some point need different URIs and different descriptions if you want to keep some consistency. That will happen first in Wikipedia, when an article will be split in several ones, because the subject description has gained in complexity and accuracy. I guess DBpedia will synchronise, but what will happen to deprecated URIs? Astronomers know that, a single star to the naked eye is a multiple system in instruments, the number of components growing with the power of the instrument. In brief, looking closely at any thing leads most of the time to the loss of its identity/individuality. This fractal and evolving nature of reality we'll need to take into account in our systems, when semantics go beyond the naive notion of the world as a set of well-identified, pre-existing things. We've just started scratching the surface of all this I'm afraid. Bernard uAntoine, On 24 Jan 2008, at 18:08, Antoine Isaac wrote: I love this example! The common sense part of my brain says: “What are these guys smoking?” While the purely logically trained part of my brain says: “Yes, this makes total sense. Obviously an antelope in a zoo is a document.” It's a fascinatingly complicated issue. I don't think it would be problematic. I want to put the man into the “history of the net” bucket, not any particular description of him. The description's purpose is just to establish whom we are talking about. “TimBL is about the history of the net” is a weird statement, I agree with you on that point (and probably disagree with Bernard). But that's part of my argument for a new property: I feel that “A skos:subject B” carries a certain implication, in natural language, that “A is about B”. I would prefer having another property that does not carry that implication. “A skos:indexedIn B” uRichard Cyganiak wrote: uHi I'm only starting to stroll in the foothills of mount semantic web but I have been attempting to use skos subjects in the mashups I've been experimenting with so I'm risking a probably naive intrusion into this debate . One of my mashups is a map of football stadiums in England: in which I make use of the skos:subject dbpedia:Football_venues_in_England in the underlying SPARQL query but if I take a typical resource (Abbey Stadium in Cambridge) in this category I find these skos:subjects: * dbpedia:Category:Buildings_and_structures_in_Cambridgeshire * dbpedia:Category:Cambridge * dbpedia:Category:Cambridge_United_F.C. * dbpedia:Category:Football_venues_in_England These are all logically redundant * dbpedia:Category:Cambridge - is the same as p:location * dbpedia:Category:Cambridge_United_F.C. is inverse of dbpedia:Cambridge_United_F.C. 
p:ground ?r * dbpedia:Category:Football_venues_in_England should be deducible given Cambridge_United is a football club (although this is not obviously so in the dbpedia data) * dbpedia:Category:Buildings_and_structures_in_Cambridgeshire is a conjunction of rdfs:type yago:Building102913152 and some geo data set which knows that Cambridge is in Cambridgeshire) This leads me to think that wikipedia categories are (nearly?) always redundant derived relationships which, since manually created or roboted) are likely to be also imperfect. Since there is no end to the derived relationships which could be materialised into such categories, perhaps it's a waste of time wondering what they really are or perhaps they should be labeled as derivedRelationship and linked to the rule which is their basis. Chris Wallace UWE Bristol This email was independently scanned for viruses by McAfee anti-virus software and none were found DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\" 1. RE : [ISSUE-77] [ISSUE-48] Re: Skos subject properties are deprecated" "Announcing Virtuoso Open-Source Edition, Version 6.1.2" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.1.2: New product features as of July 09, 2010, V6.1.2, include: * Database engine - Added FOAF+SSL based authentication for ODBC, JDBC, OLEDB, and ADO.NET client connections - Added support for following http redirects automatically - Added extra graph delta-engine functions with regards to diff, iteration over dictionary, and obtaining the biggest possible iri_id on given 32-/64-bit platform - Added initial support for Python Runtime hosting via bif_python_exec - Added client- and server-side Semantic Pingback APIs - Added ODBC setting WideasUTF16 to return UTF-16LE for SQLWCHAR - Fixed ODBC setting for UTF-8 when DB keeps UTF-8 in VARCHARs - Fixed ODBC SQLAllocStmt issues exposed when using QtSQL's ODBC layer - Fixed HTTP, SOAP, XML-RPC when used with proxies and reverse-proxies - Fixed Conductor UI for handling FOAF+SSL WebIDs for ODBC/SQL session logins - Fixed handling of column default value of 0 - Fixed support for BIGINT in parameter marshalling - Fixed issue with default maxmempoolsize - Fixed issue with extent map and free pages map - Fixed memory leaks - Fixed issue with freelist chain - Fixed issue with partitioned TOP ORDER BY - Updated documentation * SPARQL and RDF - Added Sponger cartridges for CSV, Etsy.com, FaceBook, OpenGraph, Idiomag, Tumbler, Vimeo, Wine.com, Upstream.tv, and others - Added more assertions to facets ontology - Added rdfs:label to default IFP based inference Rule - Added support for extra encodings - Added initial support for OData's Atom and JSON feed formats with regards to Linked Data Graph Serialization - Added support for gz and zip compressed CSV - Added CSV parser strict mode option - Added CSV parser lax mode - Added optimization for large descriptions on about page - Fixed EAV and SPO labeling modes consistency - Fixed add escape to CR/LF in JSON format - Fixed OData, Tesco.com, and HTML5 MicroData cartridges - Fixed generation of unique graphs lists - Fixed use label ontology inference rules for automating extraction OF geo coordinates - Fixed SPARQL handling of DISTINCT - Fixed SPARQL UNION selections - Fixed SPARQL statement with implicit GROUP BY; do not remove ORDER BY - Fixed RDFa parsing of @rel and @rev - Fixed abnormally long RDFa parsing of document with i18n URIs - Fixed support for subproperties of Inverse Functional Properties 
(owl:inverseFunctionalPropery) - Fixed support for loading inference rules from multiple ontology graphs - Fixed GPF in SPARUL INSERT optimization - Fixed issue with extra NULLs in HASH JOIN or GROUP BY - Enhanced iSPARQL using new internal RDF store for speed, browser fixes, and cosmetic changes * Native Providers for Jena, Sesame 2, and Sesame 3 - Added support for creating ruleset - Added support for inference graph - Added support for inference and query execution - Added support for query engine interface, so Jena provider now supports the following query execution modes: a) parse and execute query via ARQ b) parse query via ARQ and execute query directly via Virtuoso (new mode) c) parse and execute query directly via Virtuoso - Added support for using Virtuosodatasource - Fixed issue with batch commit - Fixed Jena's lazy initialization when graph is created - Fixed handling of quote chars in literals - Fixed issues with variable binding - Fixed small bugs * ODS Applications - Added OpenID 2.0 login and registration - Added FOAF+SSL registration for users pages (JSP, PHP, VSP, etc.) - Added FOAF+SSL based ACLs for shared resources - Added GoodRelations based Offers as part of Profile Manager - Added support for associating multiple X.509 certificates with a single WebID - Added photo and audio upload for JavaScript, VSP, PHP, and JSP pages - Added Relationship Ontology enhancements to Profile Manager - Added Client and Server support for PubSubHubbub protocol - Fixed OpenID + WebID hybrid protocol handling; reverts back to using the same URL for both OpenID- and FOAF-based Profile Page - Fixed handing of multiple items in Alternate Subject Name slot of X.509 certificate for WebID Protocol - Fixed GoodsRelations integration with SIOC-based Data Spaces as part of richer Profile Data construction - Fixed VTIMEZONE component in iCalendar data representation - Fixed Profile Manager UI associated with GoodRelations Offers - Fixed Profile Manager UI associated with identification of FavoriteThings . Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. 
SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: * Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): * Home Page: Best Regards Product Support OpenLink Software Web: Support: Forums: Twitter:"
"DBpedia as Tables release" "uDear all, We are happy to announce the first version of the DBpedia as Tables tool [1]. As some of the potential users of DBpedia might not be familiar with the RDF data model and the SPARQL query language, with this tool we provide some of the core DBpedia 3.9 data in tabular form as Comma-Separated-Values (CSV) files, which can easily be processed using standard tools, such as spreadsheet applications, relational databases or data mining tools. For each class in the DBpedia ontology (such as Person, Radio Station, Ice Hockey Player, or Band) we provide a single CSV file which contains all instances of this class. Each instance is described by its URI, an English label and a short abstract, the mapping-based infobox data describing the instance (extracted from the English edition of Wikipedia), and geo-coordinates (if applicable). Altogether we provide 530 CSV files in the form of a single ZIP file (size 3 GB compressed and 73.4 GB uncompressed). More information about the file format as well as the download link can be found on the DBpedia as Tables Wiki page [1]. Any feedback is welcome! Best regards, Petar and Chris [1] DBpediaAsTables uOn 11/25/2013 02:18 PM, Petar Ristoski wrote: Thanks Petar, your CSV files are really helpful. For all who want to import data into Postgresql, I've written a python script which automatically creates the SQL corresponding to the CSV: The column types (ofter arrays) are inferred from your headers and the data rows; indexes are also created. (If people here find this script useful, I could also package it for pypi and improve documentation a bit.)
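For anyone scripting against the DBpedia-as-Tables files directly rather than going through Postgres: a small reading sketch, assuming the layout described in this thread, one CSV per ontology class with multi-valued cells written as {value1|value2|...}. The file name, the assumption of a single header row and the column handling are all schematic, since the exact header layout is not spelled out here.

# Sketch: read one "DBpedia as Tables" class file and split multi-valued cells.
import csv

def split_cell(value):
    # "{a|b|c}" becomes ["a", "b", "c"]; plain values come back as a one-item list.
    if value.startswith("{") and value.endswith("}"):
        return value[1:-1].split("|")
    return [value]

with open("Band.csv", newline="", encoding="utf-8") as f:   # file name: placeholder
    rows = csv.reader(f)
    header = next(rows)          # treats the first line as column names (assumption)
    for row in rows:
        for name, raw in zip(header, row):
            values = split_cell(raw)
            # ... use (name, values) here, e.g. print(name, values)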
I was assuming that your files are encoded in UTF-8, which worked, but I didn't find either a '\"\"' or a '\\"' inside a field value, so I don't know how a '\"' would be encoded, if there were one. Also for a multi-value field (e.g. '{1|2|3}') I don't know how '{', '|' and '}' are encoded, if they appear within one of the values. - Maybe you could add some documentation on that. In your data I found 2 format problems (I don't think my download went wrong, but anyway, a checksum might be helpful): * Film.csv seems to have no headers (it has 20004 lines for me). * Aircraft.csv: the 2nd last row ( \" ) has too many columns. All other files (except owl#Thing.csv and Agent.csv, which I didn't check due to size and column number) were ok. I also noticed another thing, not concerning your tool, where some parser maybe could be optimized: has language=\"American (but see [[#English usage\" Regards, ibu u0€ *†H†÷  €0€1 0 + uHi Ibu, Thank you for your feedback. To simplify the parsing of the files, from all literals I removed the following characters: \"\\" { } | , \n\". If there are quotes in the URIs, they are escaped as '\"\"'. Also, there is no URI that starts with\"{\" and ends with \"}\", so there is no need to escape \"{ } |\" inside the URIs. I apologize for those two incorrectly parsed files. I fixed them couple of days ago, so please download them again. Regards, Petar uHi Petar, Thanks for sharing this! Tried to use it yesterday, but 3GB still takes quite a long time to download if you're just hacking something together from a Starbucks. From the standpoint of practicality, this would be infinitely more useful if we could download files individually, or at least in smaller chunks. Any chance we'll get something like that shared from [1]? Cheers, Pablo [1] On Thu, Nov 28, 2013 at 5:35 AM, Petar Ristoski < > wrote: uHi Pablo, Thank you for your interest! No problem, over the next few days I will provide a list of links to each file/table and I will let you know. Regards, Petar From: Pablo N. Mendes [mailto: ] Sent: Thursday, December 12, 2013 4:44 PM To: Petar Ristoski Cc: ibu ☉ radempa ä·°; Subject: Re: [Dbpedia-discussion] DBpedia as Tables release Hi Petar, Thanks for sharing this! Tried to use it yesterday, but 3GB still takes quite a long time to download if you're just hacking something together from a Starbucks. From the standpoint of practicality, this would be infinitely more useful if we could download files individually, or at least in smaller chunks. Any chance we'll get something like that shared from [1]? Cheers, Pablo [1] On Thu, Nov 28, 2013 at 5:35 AM, Petar Ristoski < > wrote: Hi Ibu, Thank you for your feedback. To simplify the parsing of the files, from all literals I removed the following characters: \"\\" { } | , \n\". If there are quotes in the URIs, they are escaped as '\"\"'. Also, there is no URI that starts with\"{\" and ends with \"}\", so there is no need to escape \"{ } |\" inside the URIs. I apologize for those two incorrectly parsed files. I fixed them couple of days ago, so please download them again. Regards, Petar uHi Pablo, I set up a web page [1] where all classes from the DBpedia ontology are available for download as separate .csv and .json files. Regards, Petar [1] From: Pablo N. Mendes [mailto: ] Sent: Thursday, December 12, 2013 4:44 PM To: Petar Ristoski Cc: ibu ☉ radempa ä·°; Subject: Re: [Dbpedia-discussion] DBpedia as Tables release Hi Petar, Thanks for sharing this! 
Tried to use it yesterday, but 3GB still takes quite a long time to download if you're just hacking something together from a Starbucks. From the standpoint of practicality, this would be infinitely more useful if we could download files individually, or at least in smaller chunks. Any chance we'll get something like that shared from [1]? Cheers, Pablo [1] On Thu, Nov 28, 2013 at 5:35 AM, Petar Ristoski < > wrote: Hi Ibu, Thank you for your feedback. To simplify the parsing of the files, from all literals I removed the following characters: \"\\" { } | , \n\". If there are quotes in the URIs, they are escaped as '\"\"'. Also, there is no URI that starts with\"{\" and ends with \"}\", so there is no need to escape \"{ } |\" inside the URIs. I apologize for those two incorrectly parsed files. I fixed them couple of days ago, so please download them again. Regards, Petar uHi Petar, There is some interest to revive this project and cannot recall / find where is the code to generate these dumps. Will you be able to help us re-bootstrap this? We can create a standalone github repo and we will try to find a new maintainer Cheers, Dimtiris On Fri, Dec 13, 2013 at 2:07 AM, Petar Ristoski < > wrote: uPerhaps here: On Sun, Nov 6, 2016 at 10:24 AM, Dimitris Kontokostas < > wrote: uThanks Tom, Petar also pointed this to me On Sun, Nov 6, 2016 at 11:27 PM, Tom Morris < > wrote:" "DBpedia Extraction Framework Dutch disambiguation data set" "uHello, I'm interested in creating a Dutch port of DBpedia Spotlight. In order to do this, I need a disambiguation data set for Dutch. This data set is currently not available for download. However, based on some messages posted here [1], I suspect that the latest version of the extraction framework supports this. Is this correct? I already imported the extraction framework and it builds successfully (I only included the core, dump and scripts modules as [2] states that the other modules are not necessary for running the extraction.). The messages posted at [3] indicate that it is only necessary to run download and extract. However, when executing download (using the command /run download config=download.properties), the following message is displayed: [INFO] Scanning for projects [INFO] uHi, On Wed, Sep 19, 2012 at 3:46 PM, Pedro Debevere < > wrote: Generally yes, if all names of disambiguation templates are specified in [4]. Please also note that there seems to be an issue with multiple names for disambiguation page titles in dutch. See the TODO in [5]. On your first attempt, it looks like something goes wrong during download. So downloading and unpacking yourself was a good idea. Wikipedia seems to have changed its export format version from 0.6 to 0.7. The DBpedia parser should still be able to parse the dump, assuming the changes mentioned in [6]. You can try to switch to the dump branch (currently the stable one) and change the line in [7] to private final String _namespace = \" and try again. (Call mvn clean install on the project root before). Cheers, Max [4] [5] [6] [7] WikipediaDumpParser.java#l74 uHi Pedro, Whenever you get all the DBpedia datasets you need, it's time to run DBpedia Spotlight indexing. You will be glad to know that we've been working on a step-by-step guide to build DBpedia Spotlight for other languages. Check this out: You should also coordinate with Dimitris Kontokostas, who has been up to now my contact for the Dutch DBpedia, and the one I had been including in the DBpedia Spotlight i18n thread. Perhaps you can help each other out. 
We hope to have a much improved (more automated) indexing process in the next couple of days, so keep in touch. Please join dbp-spotlight-users for questions about DBpedia Spotlight. Cheers, Pablo On Wed, Sep 19, 2012 at 4:45 PM, Max Jakob < > wrote: uHello Max, Thank you for your prompt reply! After switching to the dump branch and changing the namespace (and executing mvn clean and install on project root), the following is displayed: mvn scala:run \"-Dlauncher=extraction\" \"-DaddArgs=extraction.properties\" [INFO] Scanning for projects[INFO] uOn Thu, Sep 20, 2012 at 10:17 AM, Pedro Debevere < > wrote: I think you might have to use Java 7 in order to allow 'X' in date format strings. Compare: For that, you'll need to change in the main pom to 1.7 and have Java 7 on your machine. It seems that in the Dutch Wikipedia, the URLs of disambiguation pages are suffixed with multiple different strings (and sometimes none as you mentioned). All these strings have to be known for the process of extracting disambiguation links, but currently the configuration only allows for one. This should be extended to a set of strings. Cheers, Max uHi Max, This was indeed the problem. Now the extraction proceeds a little further (creating the nlwiki-20120824-template-redirects_old.obj file) but it then displays the following: mvn scala:run \"-Dlauncher=extraction\" \"-DaddArgs=extraction.properties\" [INFO] Scanning for projects[INFO] uLooks like your dump has 0.6. Can you pls delete the dbpedia jars from your .m2 dir, run mvn clean install and double check your wikipedia dump to make sure everything is clean? On Sep 20, 2012 6:16 PM, \"Pedro Debevere\" < > wrote: uHi pedro, This is an inconsistency between wikipedia and mappings wiki (the latter is still in 0.6). To generate both at the same extraction you need to set the namespace variable to null. About the spotlight, I already started the generation process. If you still want to start and still have trouble with the dataset generation I could send you \"my datasets\" to work with :) Best, Dimitris On Thu, Sep 20, 2012 at 6:42 PM, Pablo N. Mendes < >wrote: uSetting _namespace = nul does the trick. Thank you all for helping me out. Hi pedro, This is an inconsistency between wikipedia and mappings wiki (the latter is still in 0.6). To generate both at the same extraction you need to set the namespace variable to null. About the spotlight, I already started the generation process. If you still want to start and still have trouble with the dataset generation I could send you \"my datasets\" to work with :) Best, Dimitris On Thu, Sep 20, 2012 at 6:42 PM, Pablo N. Mendes < > wrote: Looks like your dump has 0.6. Can you pls delete the dbpedia jars from your .m2 dir, run mvn clean install and double check your wikipedia dump to make sure everything is clean? On Sep 20, 2012 6:16 PM, \"Pedro Debevere\" < > wrote: Hi Max, This was indeed the problem. Now the extraction proceeds a little further (creating the nlwiki-20120824-template-redirects_old.obj file) but it then displays the following: mvn scala:run \"-Dlauncher=extraction\" \"-DaddArgs=extraction.properties\" [INFO] Scanning for projects[INFO]" "Inconsistency in DBpedia?" "uDear all, includes the following key-value pair: \" \" However, includes no mention of the UCLA website. Please could someone help me to understand why this discrepancy exists? 
Many thanks, Sam uOn 11/05/2013, Sam Kuper < > wrote: I'm still unsure why the discrepancy existed, and would be grateful to know; but am happy to say that it no longer exists. Regards, Sam uThe datasets are stable for ~9 months so I guess this was Virtuoso's AnytimeQueries feature [1], [2] Best, Dimitris [1] [2] On Sat, May 11, 2013 at 6:18 PM, Sam Kuper < > wrote: uOn 13/05/2013, Dimitris Kontokostas < > wrote: Thank you for the info and the links. Would the presence of AnytimeQueries also explain why the following query sometimes yields the correct result but other times yields \"[no results]\"? Regards, Sam uProbably yes. Maybe someone from OpenLink can also verify this. Best, Dimitris On Mon, May 13, 2013 at 4:45 PM, Sam Kuper < > wrote: u0€ *†H†÷  €0€1 0 +" "What is the difference between resources types ?" "uThere are many types that could be attributed to a resource : dbpprop:type,dbpedia-owl:type and rdf:type. What the difference between them ? what is the most accurate ? There are many types that could be attributed to a resource : dbpprop:type, dbpedia-owl:type and rdf:type. What the difference between them ? what is the most accurate ?" "My first SPARQL test" "uDear dpbedia discussion mailing list, I'm trying to make nice queries to show the power of SPARQL to fellow researchers. I found it quite hard to get started to drill down to the correct query format. Let me describe my workflow of constructing my first query. Maybe you can point out how I could have done better. First I came up with a question: \"List the universities of the Netherlands in order of establishment\". I used the \"select distinct ?Concept where {[] a ?Concept} LIMIT 100\" query at construct my query. With CTRL-F I searched for 'University' and found the concept \" Next I used: \"prefix schema: select distinct ?uni where {?uni a schema:CollegeOrUniversity}\" I found many USA colleges and universities and went to the dbpedia page of one of them. There I found the country and established property. Now I used a Python script with the SPARQLWrapper library to do my next query (I want to be able to get the output in command line). I ended up with the following query: PREFIX dbpedia-owl: PREFIX dbpedia: PREFIX dbpprop: SELECT ?est ?name WHERE { ?uni a dbpedia-owl:University. ?uni dbpedia-owl:country dbpedia:Netherlands. ?uni dbpprop:nativeName ?name. ?uni dbpprop:established ?est } ORDER BY ?est LIMIT 100 But the results show 3 weird glitches: 1) The \"Conservatorium Maastricht\" ( property of 19. I searched for the reason for this weird number and found that the wikipedia article had 19?? stated as the established date in all the articles before the 12 October 2011 revision. My question with this glitch is: Can we add the revision information in the rdf triples of dbpedia? That makes it easier to find out why a particular problem occurs and it shows how current the information is. 2) The \"Technische Universiteit Eindhoven\" ( established property of 23. I think this is due to the fact that this property is parsed as an xsd:integer, which only works if it only contains a year of establishment, but we can see on entire date is given in the infobox. Instead of finding 1956, it uses the first number it encounters. 3) The \"International University in Hospitality Management\" ( that seems to be a floating point value. I could not find the cause of this. Do these problems occur often in dbpedia? How can I report them properly? Is this the correct place (I cannot access the bugtracker)? 
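The glitches Joris describes mostly trace back to malformed established values in the raw dbpprop data (a bare 19, a day-of-month 23, a float). Until the source infoboxes are fixed, one pragmatic workaround is to filter on the literal's datatype and a plausible year range. A sketch with SPARQLWrapper that reuses his query, with the standard DBpedia prefix URIs filled in; the year bounds are arbitrary.

# Sketch: the university query with a guard against garbled "established" values.
from SPARQLWrapper import SPARQLWrapper, JSON

query = """
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpedia:     <http://dbpedia.org/resource/>
PREFIX dbpprop:     <http://dbpedia.org/property/>
PREFIX xsd:         <http://www.w3.org/2001/XMLSchema#>
SELECT ?est ?name WHERE {
  ?uni a dbpedia-owl:University .
  ?uni dbpedia-owl:country dbpedia:Netherlands .
  ?uni dbpprop:nativeName ?name .
  ?uni dbpprop:established ?est .
  FILTER(datatype(?est) = xsd:integer && ?est >= 1000 && ?est <= 2100)
}
ORDER BY ?est
LIMIT 100
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["est"]["value"], row["name"]["value"])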
Is this the 'normal' workflow to construct queries to dbpedia? Can I suggest to add the 'retrieved_from_revision' relation to the resources? Thank you for your time, Joris Slob, PhD Bioinformatics, Leiden University Dear dpbedia discussion mailing list, I'm trying to make nice queries to show the power of SPARQL to fellow researchers. I found it quite hard to get started to drill down to the correct query format. Let me describe my workflow of constructing my first query. Maybe you can point out how I could have done better. First I came up with a question: 'List the universities of the Netherlands in order of establishment'. I used the 'select distinct ?Concept where {[] a ?Concept} LIMIT 100' query at University" "Statements missing from dump?" "uHello! I was recently looking at: It has one statement extracted from its infobox - birthPlace. However, mappingbased_properties_en.nt for 3.8 does not have any statements about that URI. As I understand it, it should have all properties extracted from the infobox, so why would that particular one be missing? Best, Yves uHi Yves, This triple is generated from the PersonDataExtractor and you should be able to find it here: best, Dimitris On Mon, Mar 11, 2013 at 6:35 PM, Yves Raimond < >wrote: uHello! Ah, that's great - thank you! Yves uHello! Quick other question, in which dump are the east/north/west/south properties available, e.g. for They seem to be missing from the main mappingbased_properties dump. Best, Yves On Mon, Mar 11, 2013 at 5:17 PM, Yves Raimond < > wrote: uHi Yves, these are in the dbprop namespace so I'd expect them in the Infobox_properties_* dump files the ones you mention should be here: Best, Dimitris On Fri, Mar 15, 2013 at 12:31 PM, Yves Raimond < >wrote: uThanks Dimitris - that's it. Best, y On Fri, Mar 15, 2013 at 10:40 AM, Dimitris Kontokostas < > wrote:" "Announcing OpenLink Virtuoso, Open-Source Edition, v5.0.8" "uHi, OpenLink Software is pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.8. This version focuses on performance optimizations and speed enhancements: + SQL compiler is now re-entrant + Self-Join optimizations exposed at JDBC Driver level (as was already the case with ODBC) + SPARQL engine and SPARQL-BI extension optimizations have been merged + TriG serialization format for RDF is now supported alongside RDFa, N3, Turtle, and RDF/XML + Additional Sponger Cartridges for Digg, FriendFeed, and CrunchBase + Improved graph quality and fidelity from existing Cartridges (especially Freebase, eBay, Amazon, Google, Yahoo, and many others) + Improved handling of and bug fixes relating to the `OPTIONAL' SPARQL keyword + Self-Dereferencing fixes (e.g., Sponger was not properly de-referencing its own Proxy URIs) On the ODS front, the following have been addressed: + More flexible Mapping service model based on new OAT-based Mapping Control (which also includes a \"province\" locator) + Improved SyncML integration with Briefcase folders + Gem URL fixes for Atom, RSS, and RDF feeds For more details, see the release notes: Other links: Virtuoso Open Source Edition: + Home Page: + Download Page: OpenLink Data Spaces: + Home Page: + SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): + Project Page: + Live Demonstration: + Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): + Home Page: Regards, ~Tim uHi Tim, I installed it, but seems it doesn't work. :/disk1/sda/datasets$ isql OpenLink Interactive SQL (Virtuoso), version 0.9849b. 
Type HELP; for help and EXIT; to exit. SQL> select * from DB.DBA.SYS_USERS; Error S2801: [Virtuoso Driver]CL033: Connect failed to localhost:1111 = localhost:1111. at line 1 of Top-Level: select * from DB.DBA.SYS_USERS Virtuoso was installed with default settings. Thank you in advance. On Thu, Aug 28, 2008 at 10:47 PM, Tim Haynes < >wrote: uHi Jiusheng, How exactly have you installed ( although I presume you mean \"compiled\" really ?) the Virtuoso open source archive and on what OS ? Where is your Virtuoso database instance running from and are you sure it is running ? Can you provide a copy of the \"virtuoso.log\" file which will confirm if the database was running at the time you are attempting to connect. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 1 Sep 2008, at 08:57, jiusheng chen wrote: ujiusheng chen wrote: Hi, This looks like you haven't started it. From the ${prefix}/var/lib/db/ directory, run virtuoso-t -df to make it start in the foreground and once it claims to be online, isql should work. If not, please include teh server debug output too. Regards, ~Tim uThank you, guys. You are right, there needs to start virtuoso-t explicitly before using isql. But it seemed to me this step is not necessary for version 5.0.7. On Mon, Sep 1, 2008 at 5:01 PM, Tim Haynes < >wrote: uHi Jiusheng, The Virtuoso isql program is a simple Interactive SQL client side tool for accessing a Virtuoso server instance on the specified hostname and port number (localhost, 1111 being defaults), and as such will always assume the specified server has already been started by some other means regardless of version. If their is specific documentation you read that gave the impression that isql starts a Virtuoso instance then please provide a link or details of it such that we can have it corrected or made clear as to what isql does Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 2 Sep 2008, at 02:21, jiusheng chen wrote: uHi Hugh, Thanks for your reply. Perhaps I got a incorrect impression before. BTW, how to stop the server instance? just kill the virtuoso_t job? On Tue, Sep 2, 2008 at 4:58 PM, Hugh Williams < >wrote: uHi Jiusheng, The isql program has a \"shutdown\" command that can be used to shutdown the Virtuoso instance it is connected to: $ ./isql 1111 Connected to OpenLink Virtuoso Driver: 05.00.3033 OpenLink Virtuoso ODBC Driver OpenLink Interactive SQL (Virtuoso), version 0.9849b. Type HELP; for help and EXIT; to exit. SQL> shutdown(); $ Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 3 Sep 2008, at 08:09, jiusheng chen wrote: uHi Hugh, To speed up loading large dataset like dbpedia into one PC, if parallel loading (split large data into several segments, loading these segments in parallel) is effective? any further suggestions? On Wed, Sep 3, 2008 at 5:27 PM, Hugh Williams < >wrote: ujiusheng chen wrote: Hi Jiusheng, I've usually had about 3 ISQL processes per core loading the data set. I didn't bother to split the dbpedia dataset into any smaller segments than it was to begin with. You'll need RAM - plenty of it. A good rule of thumb would be to give Virtuoso at least 2/3 of total RAM in the system. See section [Parameters] in virtuoso.ini - the ones you're interested about are NumberOfBuffers and MaxDirtyBuffers. A handy tip: you can spawn a new ISQL process when you run a stored procedure/query by using the ampersand in the ISQL command line. I.e. 
SQL> my_lengthy_stored_proc()& Best Regards, Yrjänä uHi Guys, Can I run several Virtuoso server instances on the same machine (with different port number)? if any risk exist then? I found sometimes the server instance work well but sometimes exited for no reason. Environment: OS: Debian 2.6.18.1.2007050801 Open source Virtuoso v5.0.8 On Wed, Sep 3, 2008 at 11:22 PM, Yrjänä Rankka < > wrote: uHi Jiusheng, You can run multiple Virtuoso server instance on the same machine without problems provided the port numbers the server are running on are unique and the machines has sufficient resources for the intended usage scenarios. I am interested to know about this \"server exits for no reason\" you report, thus can you provide more details on the circumstances in which these occur and your Virtuoso log (virtuoso.log) where the server may have logged any errors etc that resulted in such exits ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 4 Sep 2008, at 10:58, jiusheng chen wrote: uHi Jiusheng, Looking the the log provided the only shutdown type messages I see out of the ordinary today are of the form: 16:35:28 INFO: Checkpoint made, log reused 16:52:55 INFO: Server received signal 1 16:52:55 INFO: Initiating quick shutdown 16:52:55 INFO: Server shutdown complete Where the \"signal 1\" message is normally the result of the server having been started on foreground mode (-f) from a terminal session and then being shutdown with a Ctrl C from the command line. Can you confirm this is how you have been running the Virtuoso Server ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 4 Sep 2008, at 12:25, jiusheng chen wrote: uHi Hugh, we started server instance like this: virtuoso -df -c is it more safe to make it run on background? like: nohup virtuoso -df -c > log 2>&1 & On Thu, Sep 4, 2008 at 8:31 PM, Hugh Williams < >wrote: ujiusheng chen wrote: Kingsley u" "Announcing OpenLink Virtuoso, Open-Source Edition, v5.0.7" "uHi, OpenLink Software is pleased to announce a new release of Virtuoso, Open-Source Edition, version 5.0.7. This release includes: New: fully operational Native Graph Model Storage Providers for the Jena & Sesame Frameworks. Licensing change: the Jena and Sesame providers have been added to the \"Client Protocol Driver exemptions\" paragraph in the VOS License: Improvements: - Better support for alternate RDF indexing schemes - Parallelization of RDF sponger operations across multiple RDF data-sources concurrently - New Sponger Cartridges and enhancements to the existing Cartridge collection - Inference engine optimizations for subclass and subproperty that efficiently handle taxonomies numbering tens of thousands of classes. - OWL equivalentClass and equivalentProperty inference support. 
- Dynamic handling of host component of IRIs; host component is now flexible enough to painlessly handle multiple homing of domains and host name component changes; no duplicate host name data storage required via [URIQA] section of INI - SPARQL optimizations to improve LIMIT and OFFSET handling - JDBC driver has new connect options, smaller memory footprint and optimized batch support - ODS applications now support SyncML Documentation Additions: - How to read query plans and how to use the key performance meters - How to diagnose SPARQL queries and how to decide what indexing scheme is right for each RDF use case - How to debug RDF views - Better Documentation on SPARQL extensions and options - An updated RDF View example based on the Northwind demonstration database that reflects underlying enhancements Bug Fixes: - Generally improved safety of built-in functions, better argument checking. - Verified UTF8 internationl character support in all RDF use cases, SQL client/SPARQL protocol/all data formats. For more details, see the release notes: Other links: Virtuoso Open Source Edition: Home Page: Download Page: OpenLink Data Spaces: Home Page: SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): Project Page: Live Demonstration: Interactive SPARQL Demo: http://demo.openlinksw.com/isparql/ OpenLink Data Explorer (Firefox extension for RDF browsing): http://ode.openlinksw.com/ Regards, ~Tim" "DBpedia Mashup using XQuery" "uAs my first experiment in using DBpedia, I've made application using XQuery to map the birthplaces of players in a selected English Football club. It's described on my blog and documented in the XQuery Wikibook article I link to. Using the SPARQL endpoint threw up a small problem which may be just be my lack of understanding of SPARQL syntax: In the query which begins, for example PREFIX : PREFIX p: SELECT * WHERE { ?player p:currentclub . OPTIONAL {?player p:cityofbirth ?city}. I would have chosen to write : ?player p:currentclub :Arsenal_F.C but this throws a syntax error - how can I avoid it other than by using the url form? Incidently, I wonder why the prefix in the SNORQL interface is dbedia2 rather than p? Chris Wallace This email was independently scanned for viruses by McAfee anti-virus software and none were found DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\" DBpedia Mashup using XQuery" "Question about rdf:type for dbpedia URIs as displayed on dbpedia mobile" "uHi there, dbpedia mobile seems to have more rdf:types per resource then dbpedia itself: if I compare the data for a dbpedia URI which comes from dbpedia with the data that dbpedia mobile (through marbles) reports, then there are usually more rdf:types on dbpedia mobile. An example: rdf:type * yago:Landmark108624891 * umbel-sc:MemorialMarker versus (8 types, 7 of them from yago) Does anybody know from where dbpedia mobile got those additional rdf:types ? cheers, Benjamin." ":Basekb Updated to 2012-11-04" "uInfovore, the RDF processing framework that produces :BaseKB has passed a final round of tests in the AWS cloud. As a result, new :BaseKB RDF files are available at Two files have been created from the last quad dumps that have been released by Freebase on 2012-11-04. :BaseKB Pro consists of nearly all nontrivial facts from Freebase, whereas :BaseKB Lite is restricted to topics that exist in DBpedia. By loading either product into a triple store, you can run SPARQL queries that produce correct answers. 
(:BaseKB lacks on the order of 10^4 facts that cannot be resolved because of ill-formedness in the “one true graph”) This version of :BaseKB differs from previous versions in that the :knownAs predicate is no longer asserted. The infovore framework contains the basekb-tools, which rewrite “human friendly” identifiers into unique mid identifiers, just as the MQL query engine for graphd does. This “grounded SPARQL” is used extensively inside Infovore, even in phases before the major reconstruction of Freebase is available. The toolkit, as shipped, works with and out-of-the-box the open source edition of Openlink Virtuoso. Configuration can be done via the Spring framework, so it is straightforward to adapt to any triple store that uses SPARQL protocol endpoint or otherwise has Jena drivers. I’m finishing up the documentation for the 1.0 release of Infovore, which will be the exact software used to create the latest version of :BaseKB. Optimized for current four-core processors, anyone can use this tool to convert a Freebase quad dump to industry-standard RDF in less than twelve hours; Infovore contains a test suite that demonstrates successful name resolution and SPARQL queries against either :BaseKB Lite or :BaseKB Pro." "DBPedia dump - Categories (Skos) - other languages ?" "uDear All, I just found out about this very nice Categories extract in SKOS format. But from what I see, there is only the english version. Is there a way to get this kind of feature for other languages ? (I am looking for french and chinese) Thanks for any answer Fabian Cretton uHello,
the French MediaWiki category namespace is probably not 'Category:' but 'Catégories:', encoded: Cat%E9gories:, so once we have that sorted out, there will be more language-specific versions. We are currently still optimizing the live extraction; after that, this is the next item on the list. If you are really interested, you might help us and try to run the extraction yourself and see what faulty data it produces. Check out [2] and configure and run extract_test.php with the article category extractor. You can already configure how the URIs should look (see dbpedia.ini and compare with dbpedia_default.ini). But it is still a long way to go. Regards, Sebastian, AKSW PS: Note: it's DBpedia, not DBPedia [1] [2] Fabian Cretton wrote:" "Will the real URI stand up? [dbpedia vs wikipedia vs the world]" "uI notice lines in the dbpedia dumps that look like . Note the URL encoded %2C=\",\". Anyhow, if I go to I see two redirects [one of which unescapes the comma] and ultimately end up at If I go to Wikipedia I get redirected to which, oddly, displays the same content as \"Boston\" [rather than 301 redirecting] When I do curl -H \"Accept: application/rdf+xml\" I see stuff like Now if I run the SPARQL query select ?Predicate where { ?Predicate } I get nothing, but if I run select ?Predicate where { ?Predicate } I get So it looks like the %-encoded URI is the \"real URI\" in dbpedia. Obviously I ought to keep it around in case I want to run a SPARQL query now and then. Also, dbpedia encodes wikipedia this way as well, . uHi, we try to be as close as possible to the Wikipedia title encoding scheme. The previous %-encoding of comma and ampersand is a bug that will be corrected in the next release. The current behavior is as follows: - The alphanumeric characters \"a\" through \"z\", \"A\" through \"Z\" and \"0\" through \"9\" remain the same. - The special characters \".\", \"-\", \"*\", \"/\", \"&\", \":\", \"_\" and \",\" remain the same (some of them only in the upcoming release, including the comma). - The space character \" \" is converted into an underscore \"_\". - All other characters are unsafe and are first converted into one or more bytes using UTF-8 encoding. Then each byte is represented by the 3-character string \"%xy\", where xy is the two-digit hexadecimal representation of the byte. - Furthermore, multiple underscores are collapsed into one. The class org.dbpedia.extraction.util.WikiUtil.scala in the framework might also give pointers on how to deal with this issue. Best, Max On Wed, Oct 13, 2010 at 11:29 PM, Paul Houle < > wrote:" "Require insights" "uHello everyone, I am planning to accommodate DL-Learner suggestions with the mapping wiki by allowing the user to classify a learned description as either good or bad. Bad axioms are rejected and stored for further learning while good axioms are reflected back into the ontology and are mapped to a mapping wiki template for processing. I would be thankful if anyone can let me know the correctness of the list of sub-tasks below. 1) Get feedback/result (definition for given concept) from the DL-Learner. 2) Display them to the user to decide good or bad. 3) For good axioms, allow the user to accept it (by pressing \"Reflect\" button), which enters the new axiom/definitions in the DBpedia ontology. 4) Save bad axioms for later learning. 5) Convert the displayed axiom to a statement for mapping, using the mapping template language. 6) Save updated mapping template.
Questions: 1) To accomplish step 5, would new keywords be required in the mapping template [1] language? I was not able to find constructs to express operators like intersection, union, value restriction and existential quantifiers. Regards, Ankur." "The DBpedia Ontology Survey" "uApologies for cross-posting Dear All, As the season of giving is upon us, the DBpedia ontology (DBO) community would like to request a little help from all its well wishers/users. With the aim of improving the ontology, we have drafted a survey to get some feedback on current usage of DBO and potential aspects which we think would benefit most from improvements. We would like to request a little of your time over the coming festive days, to undertake the survey and provide us with your thoughts on how DBO can be improved for its users. The deadline for taking the survey is 15th of January 2016. The survey can be found at Many thanks and Merry Christmas, Monika Solanki On behalf of the DBO committee" "Query problem" "uHi all, I am trying to execute a construct query against the dbpedia.org SPARQL endpoint by using Jena. The SPARQL query is quite long: I got this exception: HttpException: 500 SPARQL Request Failed. I think this exception is related to memory, but I am not sure whether it is the memory of my local server or the server used by the dbpedia.org SPARQL endpoint. Also, when I reduce the query to a few lines it works OK. Is there anything that can be done to increase the memory used? Any help please. Regards Abduladem S. Aljamel Nottingham Trent University Nottingham, UK uHello, On 04.08.2010 01:33, Abduladem Eljamel wrote: Does the query work when you use the form at (You could also test via something like wget -S -O-" "Add your links to DBpedia workflow version 0.3 (help wanted)" "uHi all, there were some discussions meanwhile and we are able to present an updated version of the workflow: What is new is that we will also include SILK link specs, as well as the scripts which generated the links. The README was updated. Please apply to join this effort and email me if you want write access to the repo and join the linking committee. We are really looking for managers who help us push this effort forward. We also need a proposal for the metadata.ttl and how to maintain the links and load and validate them. All the best, Sebastian" "Merging dbpedia with local ontology" "uHi there, I am new here. I have a question.
Say, I have a Korean movie star “Bae_Yong_Joon”; his dbpedia link is like this On top of this I want to add a lot for the Korean users (of course with Koreans). So I developed a short testable ‘Korean movie’ topic map in which many other properties and values are added. Now, I have to fetch dbpedia RDF pages in real time and combine them (dbpedia real-time data) with the one I developed. Now!!! I got stuck here. The question is how can I combine these two? I mean, how can I add something on here? In my Korean movie topic map, every topic and instance is identified with a URI (some I used from dbpedia, some I created if dbpedia doesn’t have them). Please help me. Cheers you guys Myungdae Cho" "GSOC applicant" "uHi, I just received the message of rejection from GSOC. However, I would still like to receive assistance to work on my proposed project outside GSOC. I would gladly appreciate it if I am given the due assistance. Thank you John Okorie Google Student Ambassador Regent University College Of Science And Technology MyBlog www.chikaokorie.blogspot.com uIf you have questions, go ahead and ask. On Mon, Apr 27, 2015 at 3:53 PM, john okorie < > wrote:" "Some thoughts about cities and buildings" "uI've been poking about a bit and found the following things: (1) The browser at does not seem to work in Google's Chrome. Works OK for me in Firefox (2) I'd like to see more definite comments for the items. For instance, I'd like to see something in the definition of 'City' that gives a specific answer to the Tokyo and London question. If this isn't stated somewhere, things are either going to be random or we'll be having edit wars. Personally I'd be willing to put my opinion in there, but I'd like to see some process as to how this gets done. (3) There is no one simple reason for why the city assignments get lost because the infobox mappings are pretty complicated. For instance, places like Manchester NH, NYC and Sao Paulo have an Infobox:Settlement, and the \"City\" designation should be triggered by settlement_type = City I noticed however, that Manchester has settlement_type = [[City]] which is \"reasonable\" (certainly reflects linked data thinking) but I don't know if the extractor is going to get that. On the other hand, if you look at the entry for Dresden, Dresden has Infobox:German_Location and the \"Citiness\" of Dresden is triggered by the line Art = City in the infobox. There's also an Infobox for Japanese_City, so I'm sure that there are a lot of details.
(4) If there's a root cause for the problem, it's that there isn't a closed feedback loop. If you're looking at this as a problem of \"transforming something from form A to form B\" it's clear that the system produces \"B\". It's only when you actually try to use \"B\" that you find that \"B\" is full of holes. Overall it's a system problem: I'm sure that we can get better results by changing the extractor rules (in fact, we'll get the fastest gains this way) but some changes to Wikipedia content will be necessary too. The complexity of the infoboxes means that an agent that does these corrections could be a bit complex, although its behavior could probably be controlled by the infobox mappings. For instance, it may end up doing something a bit different for a \"German Location\" than it would for a \"Settlement\". Along the way it's also tempting to do some canonicalization. For instance, the word \"City\" in the infobox header for NYC is just plain text, but the word \"City\" for Manchester NH is a hyperlink. You can make a case for both, but from a quality standpoint, the same thing should be done in both cases. In the case of the German locations I see that the English words \"Town\" and \"City\" are often used in the \"type\" and \"art\" fields, but the word \"Stadt\" is treated by the framework as if it were synonymous with \"Town\", which, from what little German I know, isn't quite right (isn't Munich a Großstadt?). But perhaps the word \"Stadt\" has some special semantics in the context of Wikipedia, and it ought to be preserved" "Concept Identifiers" "uHello, How are concept IDs handled for DBpedia? It looks like the concept URIs are descriptive (i.e. for the concept the concept ID is "Solar_System"). Are the descriptive IDs used throughout all of dbpedia (back and front end) or are terms ultimately kept unique by using numeric identifiers? I've been developing a controlled vocabulary and I would also like to use URIs so that my terms can be used with other linked data schemes. My group and I have had a lot of discussions regarding the concept IDs; some want them to be descriptive, based on the preferred term for each concept, so that they are human readable, but this could cause problems if the terms used to describe each concept change over time; others want them to be randomly generated so that if the description of a term drifts over time the URI for the concept will always remain static. We are trying to figure out if there are any standards or best practices we should be looking towards when it comes to concept IDs. Any thoughts/comments/justifications would be appreciated. Best, Katie uDear Katie, DBpedia mostly uses descriptive URIs that are based on the titles of Wikipedia articles in a specific language. These URIs change if pages are renamed, but for many concepts, this does not occur so often. You would probably only notice it if you are using the URIs for several years. If you instead want to use numeric IDs based on Wikipedia pages (or DBpedia URIs), you can take them from Wikidata. These IDs are stable, but not descriptive. They are kept unique in that they can only be deleted but not reused. For example, the Wikidata URI uses content negotiation to redirect you to the HTML page if you open it in a browser, and to RDF if you open it with an RDF crawler. See direct links to the RDF content. To manually find out what the Wikidata ID is for a Wikipedia page, you can go to the Wikipedia page and use the link to "Wikidata item" on the left.
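The same lookup can also be done programmatically against the Wikidata SPARQL endpoint. A minimal sketch, with the English Wikipedia article on the Solar System as an assumed example (the concrete page URL is illustrative, not taken from the original message):

PREFIX schema: <http://schema.org/>
SELECT ?item WHERE {
  # a Wikipedia page is connected to its Wikidata item via schema:about
  <https://en.wikipedia.org/wiki/Solar_System> schema:about ?item .
}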
To do this in an automated fashion, you can use the SPARQL endpoint, e.g., with the query SELECT * WHERE { schema:about ?item . } (try it in the Wikidata SPARQL UI: The Wikidata Web API can also map page titles to IDs if you prefer JSON over SPARQL: Each of these methods can also be used to fetch many IDs at once. So basically it is fairly straightforward to translate from DBpedia URIs to Wikidata URIs. The mapping between the two changes over time only when DBpedia URIs change their meaning (e.g., if "Solar System" is renamed to "Solar System (astronomy)" or something). Best regards, Markus On 26.05.2016 20:43, Katie Frey wrote: uHello Katie On Thu, May 26, 2016 at 9:43 PM, Katie Frey < > wrote: DBpedia doesn't have the concept of IDs; we have the whole IRI as the concept ID, and yes, it is used throughout all of DBpedia. This depends on how granular you want your concepts to be. If the description changes over time, does it also mean that the concept slightly changed? In DBpedia, whenever a Wikipedia page is renamed, the DBpedia IRIs are adapted as well, but we also create a link from the old page to the new one, e.g. dbr:OldPageTitle dbo:redirects dbr:NewPageTitle. For some cases this tracking is important, for others not; this is up to you to decide. It is not clear if you are trying to re-use existing concepts (e.g. DBpedia, Wikidata, etc) or create your own. If you are trying to create your own, Freebase had a nice model, similar to DBpedia but without descriptive IDs (cannot recall which paper documented that). If you want to reuse existing concepts there are different arguments on choosing a KB to base on. Best, Dimitris uDear Dimitris, We are creating our own concepts for the Unified Astronomy Thesaurus ( ) and are wondering whether there are community standards or best practices for using numeric IDs vs. descriptive IDs. Or, if there are no best practices, then what is the reason DBpedia is using descriptive IRIs (sounds like dbpedia uses descriptive IRIs because that's what Wikipedia uses, fair enough!). You did raise an interesting question regarding the granularity and how the change of a description might also signify a change in the understanding of the concept. We will have to think on this! Best, Katie uDear Markus, Thank you for the insight. We might also try to assign both numeric and descriptive IDs to a concept. It seems as though best practices don't really exist in this area, other than the general imperative to keep the URIs simple and as stable as possible. Best, Katie uClarifying: The function of an ID (identifier) is to identify a concept, not to describe it. So they must be unique by definition; a concept cannot have multiple IDs. Multiple concepts may be equivalent to one another, but that doesn't mean they are the same. If you have two different IDs (i.e. the Unicode comparison of the strings results in a mismatch) you have by definition two different concepts. That is not what you want. You want multiple labels for the same concept/ID. The description of a concept (or definition, philosophically speaking) is the whole group of other properties you ascribe to the concept by referencing its ID, including as many human-readable labels as you need and relations to other concepts.
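A concrete illustration of "many labels, one identifier", using DBpedia's Berlin resource as an assumed example:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  # one IRI, one concept, many language-tagged labels ("Berlin"@en, "Berlino"@it, ...)
  <http://dbpedia.org/resource/Berlin> rdfs:label ?label .
}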
You can choose to use an automatic generated ID in order to guarantee uniqueness or use manually ascribed human readable strings, but with the later you must guarantee the uniqueness by process (as is the case with DBpedia IDs, which are derived from Wikipedia IDs). It is not a RDF requirement for the IRIs to be simple (although that saves parsing time) or human readable, but they MUST be unique and stable, otherwise the identity of the concept is compromised. Human readable IDs are only helpful in manual edition of files, which happens only in examples and didactic purposes. Real world IDs are mostly UUIDs (universally unique IDs) generated by the system (for Java, see java.util.UUID ). Some systems use prefixed URLs in order to embed provenance into the ID, but that is a very, very bad practice: provenance is metadata as any other, you should use specific properties for that. IDs were not designed to provide any other semantics besides identity. Cheers. Marcelo Jaccoud Amaral PETROBRAS Tecnologia da Informação e Comunicações - Arquitetura Tecnológica (TIC/ARQTIC/AT) dum loquimur, fugetir invida aetas: carpe diem, quam minimum credula postero. uHi Katie. I don't think there are universally agreed best practices in this space and people often have strongly held views on either side. You don't mention internationalization/localization which is, in my experience, a bigger concern for folks than semantic drift. Those who believe in numeric identifiers often think that using identifiers in a given natural language provides that language an undeserved pride of place and priority over other languages. Folks in this camp include the creators of CIDOC and there are people dismayed by BibFrame's abandonment of MARC-style numbers. sensible in the abstract, suffer from the weak tools that we have, so end up disadvantaging everyone equally, but everyone more than English identifiers probably would. Your note implies that concept URIs could change over time if they had natural language words as part of the URI. I don't think this would be a good practice. If UAT:Black now means \"orange,\" I think you need to either live with UAT:Black as the URI, mint a synonym UAT:Orange (and keep UAT:Black), or deprecate UAT:Black as a valid concept and create a new concept UAT:Orange. Which course of action is most appropriate will depend on the specific circumstances of a change. If you decide there's a new concept UAT:DarkGrey, that is split off from UAT:Black, perhaps the original can exist unchanged, but if you decide that there's really no such thing as \"black\" but just UAT:DarkGrey and UAT:DarkestGrey, then perhaps UAT:Black gets deprecated and removed. Changing the pieces of URI to UAT101, UAT102, UAT301, etc doesn't really affect most of the discussion. The only case it makes easier is avoid UAT:Black having a description of \"vibrant orange,\" if the concept drifts far enough from its original label (which is embedded in the URI). Since Dimitris mentioned Freebase, briefly what they did was initially mint English language URIs based on the label of the topic, but eventually abandoned the practice because it was too difficult to do automatically and added too little value. They did keep English identifiers for types & properties which were part of the scheme, but these were hand assigned and provided a useful organizing function to group properties with the associated type, types with their domain, etc. 
A powerful feature of the Freebase setup was that a single topic could have arbitrarily many URIs, so dereferencing /en/Boston, /authority/viaf/1234, /authority/loc/lcnam/nm1234, /wikipedia/en_title/Boston (city), etc could all fetch the same the same content (without the use of redirects). The core identifiers for non-schema topics were machine generated sequential IDs encoded with a compact base 37(?) encoding, e.g. /m/0d_23 Tom p.s. I'm a couple of blocks away if you want to chat about this stuff some time. On Thu, May 26, 2016 at 2:43 PM, Katie Frey < > wrote: uRe \"a concept cannot have multiple IDs\": There seems to be some major misconception underlying the below email. It is certainly not the case that identifiers must be unique by definition. Every identifier must identify a unique object, but a single object can have multiple identifiers. This is standard meaning of \"identifier\" applies to URIs just as well as to IDs in databases and all kinds of other areas. Requiring that every concept can only be identified in one way would make things very hard in decentralised data ecosystems such as LOD. In fact, it would already cause problems in individual databases, which often need to merge identifiers as they improve over time (e.g., MusicBrainz Ids 2ca98ac1-62f0-4cdc-89ad-9d4b7602440a and 6a10f49d-e7db-40d3-a348-bbd5717ebbbe refer to the same concept). Cheers, Markus On 31.05.2016 17:53, wrote: uHi Tom, My impression is that almost all ids in almost all datasets are opaque for reasons that are not so much related to language-neutrality concerns (but I guess it has been a relevant point in some efforts, especially when French and English people collaborate ;-). Named IDs work best on closed domains where names/labels change very rarely. Names of places are maybe the best example, since you can use their relative location to make them unique, e.g., the dmoz id for Dresden is: \"Regional/Europe/Germany/States/Saxony/Localities/Dresden/\" This is a domain where named IDs work really well. Several scientific IDs also are great in this respect. Other domains are not so easy to handle but are still working fairly well. Humans are a good example: they change their names relatively rarely, but they cannot be identified by the name alone. You need to add something else to achieve uniqueness of the ID. For example, Tim Berners-Lee's ID as a Fellow of the Royal Society is \"timothy-berners-lee-11074\". So this would be a mixed approach. For other domains, named IDs are doable but not nice (not even for humans). Movies and books are typical examples of areas where labels are quite complicated to work with (they are not unique, they can be very long, and they can have all kinds of weird symbols and markup). This is probably why the vast majority of databases and catalogues in such areas are using opaque, numeric IDs. And then, finally, there is the big class of cross-domain data where labels can have complicated forms, clash often, and change all the time. Wikipedia, Freebase, and Wikidata are dealing with this. Here it is rather tricky to maintain stable named IDs, and indeed Wikipedia does not work very well as an ID provider. Reuse of the same IDs for a variety of things is the main problem here. I like the approach you had in Freebase, using opaque IDs for stability but supporting additional (possibly domain-specific) IDs to ease use. 
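DBpedia and Wikidata together already behave this way: the descriptive DBpedia IRI and Wikidata's opaque Q-identifier name the same topic and are linked to each other, so either can be recovered from the other. A sketch against the DBpedia endpoint, with Berlin as an assumed example:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?id WHERE {
  <http://dbpedia.org/resource/Berlin> owl:sameAs ?id .
  # keep only the opaque Wikidata identifier (wd:Q64 in this case)
  FILTER(STRSTARTS(STR(?id), "http://www.wikidata.org/entity/"))
}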
Some databases already have something similar, e.g., TED has two speaker IDs for Tim Berners-Lee: If I would implement a system like the one Katie might have in mind, I would still use the opaque IDs and then display nice labels to the user within my tool. A properly formatted label is always better than the best of URIs when you want to show it to a user (\"ted:tim_berners_lee\" vs. \"Tim Berners-Lee\"). Raw RDF tools cannot be expected to provide this level of (end-)user friendliness, but many of the higher-level tools I am seeing today are working with labels (in multiple languages) without big problems. Best regards, Markus On 31.05.2016 18:12, Tom Morris wrote: uHi Katie, you can also use You will get the Wikidata data plus the proper links to DBpedia data, which is extracted from the article and enriched with taxonomies and links. We also do a lot of quality control. see this paper In the future, we are considering propagating the Wikidata IDs as the main IDs. From my experience: If you have a more static set of Ids, e.g. for planets of the Solar system, chemical elements, countires of the european union, descriptive IDs are fine. There might be some change, but it is minimal. For highly dynamic entities, e.g. names and events and opinions extracted from text you should definitely use UUIDs Anything in between can be a mixture, e.g. like an md5 hash over the most relevant properties. All the best, Sebastian On 31.05.2016 16:47, Katie Frey wrote: uOn 01.06.2016 10:02, Markus Kroetzsch wrote: uOn 1 June 2016 at 11:02, Markus Kroetzsch < > wrote: Indeed, concepts (and other things) may have multiple IDs. This is the case in the real world (e.g. different national libraries have their own identifiers for the same person) and it would be wrong not to be able to represent this in information systems (e.g. as Linked Data). OWL does not make a unique name (ID) assumption either - a reasoner might conclude that two resources (with different URIs) are the same. An example from the Library of Congress: the concept for Linked Data has 3 different URIs - probably for historical reasons. Other data sources (DBPedia, Wikidata, other libraries) may have their own identifiers for this concept. Cheers, Uldis On 31.05.2016 17:53, wrote: uOn 01.06.2016 10:46, Sebastian Hellmann wrote: The UNA is a principle in formal logic and knowledge representation. It is not really related to this discussion. For example, standard DBMS all make the UNA, but you can still have many identifiers (keys) for the same object in a database. The explanation is that UNA refers to the internal interpretation of symbols in the database, whereas the outside, user-level notions of \"identifier\" and \"concept\" may be represented in a database by many symbols and their relationships. Markus uHi Markus, On 01.06.2016 12:58, Markus Kroetzsch wrote: Then the database does not use UNA. The above sentence reads like you could have two primary keys, but then still have them pointing to the same row. UNA means, if you have two identifiers A, B you add a triple A owl:differentFrom B at all times. Sebastian uHi Sebastian, On 01.06.2016 13:07, Sebastian Hellmann wrote: I don't think that this mixing of different notions is making much sense. Every SPARQL processor under simple semantics makes the UNA, while RDF and OWL entailment regimes for SPARQL do not make it. This has nothing to do with how you model concepts and their IDs in your domain. 
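In practical terms, a plain SPARQL store with no entailment regime treats two such IRIs as distinct terms, so any merging has to be written into the query itself, for example by walking owl:sameAs links explicitly with a property path. Again a sketch, with Berlin as an assumed example:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?label WHERE {
  # follow owl:sameAs in both directions, zero or more steps
  <http://dbpedia.org/resource/Berlin> (owl:sameAs|^owl:sameAs)* ?id .
  ?id rdfs:label ?label .
}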
You can have the same data and use it in different SPARQL tools, sometimes with a UNA sometimes without, but your choice of modelling identifiers is not affected by that. Markus uHi Markus, On 01.06.2016 14:49, Markus Kroetzsch wrote: Makes total sense to me, since they are all quite similar. Entity Relationship Diagrams are similar to Ontologies/RDF, and SPARQL is often implemented using relational databases. The relational model (Codd) is consistent with first-order predicate logic, as are many description logics; in particular a less expressive fragment was used to design OWL. What is simple semantics? SPARQL stores backed by relational DBs often have the quad {?g {?s ?p ?o}} as the primary key. De facto, UNA produces contradictions as soon as you want to state that two things are the same. So owl:sameAs would not make sense combined with UNA, as it would always cause contradictions, except in the reflexive case. Just because you are not unifying/merging identifiers right away does not imply UNA. There are SPARQL tools that throw a contradiction if they encounter owl:sameAs. OWL was designed to handle multiple identifiers. This affects the modeling in a way that it is fine to have several IDs. DBpedia as such uses this. Below are all IDs for DBpedia Berlin, where the first one is the canonical one. A good idea might be to provide as well in the future. We are working on a service that allows to canonicalize all DBpedia IDs, which is only legit as there is no UNA intended in OWL. All the best, Sebastian uHi Sebastian, I'll try to clarify further. This really is a tricky topic and maybe more than an email thread is needed to explain this. If you want to dive into the details, you may want to check out some textbooks to get started (Abiteboul et al. would be the standard intro to database theory and Relational Algebra; for FOL there are many choices, but there is no DL-specific textbook; there are some good DL tutorials, however, that may be useful). I don't know of a good reference that explains the differences that are causing confusion here. On 01.06.2016 17:03, Sebastian Hellmann wrote: Sorry, but you are mixing up things again here. Being "similar" is not enough to establish a logical relationship between two formalisms. Even the underlying logic (FOL here) is just one aspect. OWL semantics is based on *entailment* of logical consequences in FOL. In contrast, Relational Algebra is based on *model checking* with respect to finite FOL models. The two tasks are totally and fundamentally different (model checking is PSpace complete, entailment checking is undecidable, for a start). It's beyond this thread to explain all details relevant here, and the somewhat vague notion of "UNA" does not really do it justice either (UNA is really a property of a logic's model theory, but does not tell you whether you are doing model checking or entailment). "Simple semantics" is the most basic way of interpreting RDF graphs. If you would like to know more, then you could start with the spec: Most SPARQL processors do not go beyond this, though their semantics is specified differently (based on model checking rather than on entailment, which makes it more natural to talk about, e.g., negation and aggregates). Nevertheless, the simple semantics is kind of built into the SPARQL BGP semantics already, so you cannot do anything less if you implement SPARQL. I cannot make sense of these sentences.
UNA is a property of the semantics you use, which in turn is determined by the tool (reasoner) you apply. You cannot \"imply UNA\" uHi, Interesting discussion. One point that I do not think has been evaluated here is the time, location and context components of a statement. The assignment of an identifier is simply a statement made about a concept that instantiates it as an entity. That statement is bound to a time, location and context and the notions of persistence and uniqueness are interpretations from that singular point of view. In a modal reality, an infinite set of identifiers exists for any given concept. The imposition of a uniqueness constraint on an identity is decidable only within a well-known (and highly administered) domain. CIDOC has a property called \"P48 has preferred identifier\" with an owl:Restriction maxCardinality of \"1\". This seems to be a reasonable solution to the question of identifiers. Many identifiers can exist (and usually do), but there can be only one \"preferred identifier\" for a resource in a given ontology. Cheers, Christopher On 1 June 2016 at 21:53, Markus Kroetzsch < > wrote: uHi Christopher, that is a good tipp. So in the future we would let them point to the wkdb (wikidatadbpedia) identifiers: cidoc:preferredIdentifier . what is the exact URL? and what is the difference to rdfs:definedBy Sebastian On 02.06.2016 04:43, Christopher Johnson wrote: uHi Sebastian, The OWL 1.0 implementation of CIDOC exists as the Erlangen CRM. The ecrm documentation about P48 can be located here . The scope notes of CIDOC are informative about the interpretation of expected usage. rdfs:definedBy is (a self-referencing pointer) to an rdf vocabulary that would properly be used in a formal rdfs class/property schema definition like dcterms . (z.B. dcterms:TGN rdfs:definedBy ) I favor not using rdfs: properties for normal data graphs, because (in theory) rdfs: should be reserved for use only in a schema. But, rdfs:label is an example of a convention for literals that is probably not so well-conceived (at least in DBpedia and Wikidata). dcterms:title seems better suited for this purpose, but this is another topic entirely! VG, Christopher On 2 June 2016 at 14:48, Sebastian Hellmann < > wrote:" "DBpedia Lookup Service back online!" "uHi all, the DBpedia Lookup Service is back online. We sincerely apologize for the interruption. The DBpedia Lookup Service can be used to look up DBpedia URIs by related keywords. Related means that either the label of a resource matches, or an anchor text that was frequently used in Wikipedia to refer to a specific resource matches (for example the resource “USA”). The results are ranked by the number of Wikipedia page inlinks. Two APIs are offered: Keyword Search and Prefix Search. The URL has the form The Keyword Search API can be used to find related DBpedia resources for a given string. The string may consist of a single or multiple words. Example: Hits=5&QueryString;=berlin 2. Prefix Search (i.e. Autocomplete) The Prefix Search API can be used to implement autocomplete input boxes. For a given partial keyword like berl the API returns URIs of related DBpedia resources like Example: &QueryString;=berl Parameters The three parameters are Query String: a string for which a DBpedia URI should be found. Query Class: a DBpedia class from the Ontology that the results should have (for owl#Thing and untyped resource, leave this parameter empty). 
Max Hits: the maximum number of returned results (default: 5) Results The service returns results in XML format. The results comprise the following: URI Label (Short) Description Classes (URI & label) Categories (URI & label) Refcount (number of Wikipedia page inlinks) Search Index DBpedia Lookup relies on a Lucene index that has been built from the October / November 2010 Wikipedia dumps (DBpedia 3.6). The documentation of the DBpedia Lookup Service is also found at Thanks a lot to Max Jakob (Freie Universität Berlin) for updating the index behind the service and getting the service back online! Have fun using DBpedia Lookup! Cheers, Chris uSorry if I ask, but is there any WSDL? The previous lookup had one and I built a Java application that used the lookup service as a web service, generating the code with Axis from the WSDL. Please tell me I don't have to redo this from scratch. Piero On 21 Jan 2011, at 10:11, Chris Bizer wrote: uThe WSDL 2.0 description can now be found at I added a link to the Lookup homepage. Best, Max On Sat, Jan 22, 2011 at 17:46, Piero Molino < > wrote:" "Updation of Wiki Page" "uOn Jun 6, 2014, at 8:59 PM, Arvind Iyer < > wrote: Greetings from INDIA. The above mentioned URL contains Wikipedia information sourced from an old page that is both irrelevant and outdated. Might it be possible to update the dbpedia page with just what exists currently on Wikipedia, as it is both relevant and live. Thank you very much, warm regards, Arvind Iyer Mumbai, INDIA" "dbpedia ontology" "uHi everyone, I want to understand how many properties, sub-classes and classes are related to a property of a resource in the dbpedia dataset (access to the dbpedia ontology). In other words, navigation between the properties and concepts by using Vocabulary Links. How can I do it? Any help would be greatly appreciated." "Reference facts in DBpedia" "uHello all! Vayu is a god of the wind. In the Wikipedia infobox, this fact has a reference. References are stated between <ref> and </ref> tags. Here is an excerpt from the infobox: | God_of = the wind The Book of Hindu Imagery p. 68 The problem is that dbpedia thinks Vayu is a god of the reference, too: SELECT ?topic WHERE { dbpedia2:godOf ?topic } Returns: \"the wind The Book of Hindu Imagery p. 68\"@en 1) We should filter out the reference. 2) It would be nice for DBpedia to know this: the fact \"Vayu is a god of the wind\" is asserted by \"The Book of Hindu Imagery p. 68\". This would mean creating five triples, see That's it, just wanted to let people know; hopefully someone will have time to work on it, probably around line 70 of InfoboxExtractor.php. I will try when I have time :-) Cheers, Nicolas Raoul. http://nrw.free.fr" "triple bug with funny characters in urls" "uhello, there seems to be a bug in handling \"funny characters\" in urls, for an example, see [1]. first, urls that are in fact utf8 encoded, are stored in latin1 representation, which means breaking the url (as urls are byte strings). second, funny characters, once recoded from latin1 (although it was utf8), are then saved using html entities.
from my experience with encoding quirks (too much :-) ), this tends to make things worse; in this case, the converted urls are saved in n-triples, where the entities have no meaning again. for example, in [2], there is a line saying . third, some urls are cropped after the first semicolon (which was introduced by the second bug), inferring owl:sameness for distinct resources as shown in [1]. i've only looked at the compiled download datasets, so i can't say anything about where the bugs are located between wikipedia and those zip files. regards chrysn [1]: [2]: gutenberg.zip" "DBpedia/Linked Data" "uHi, I am studying ontologies and Linked Data and I have some questions in relation to DBpedia: 1. If I make an ontology or triple store, how can I connect it with all DBpedia triples? 2. Can I establish a connection with DBpedia as you have with Yago or other ontologies? Do I have to have a web server which makes that connection, or do I have to import my ontology into a database such as Freebase etc.? 3. If I have my ontology in Protege, is it possible to connect it with DBpedia? 4. Do you prefer Freebase, Yago or something else for development of a database? Thank you in advance, Ivana uHi, As I understand it, your questions are related to the task of linking different data sets. In general, the creation of links between different data sets (for example, DBpedia and another ontology) can be done with the SILK tool: I hope this answers your questions. Regards, Max On Wed, Jul 28, 2010 at 7:45 AM, Ivana Sarić < > wrote:" "Question about classes/properties returned by queries" "uDear DBpedia people, Congratulations for your work and for your will to put some useful order in the chaos of the internet. I'm a beginner. I've been browsing your website at http://dbpedia.org and I have a basic doubt. In the file "dbpedia_3.9.owl" I can see the 529 class names and the 2,333 property names. All of them are like this: "http://dbpedia.org/ontology/*" In the file "mappingbased_properties_en.nt" (3.4 GB) I have seen only 15 property names which are not "http://dbpedia.org/ontology/*": 1. <http://xmlns.com/foaf/0.1/familyName> 2. <http://www.w3.org/2003/01/geo/wgs84_pos#lat> 3. <http://xmlns.com/foaf/0.1/page> 4. <http://xmlns.com/foaf/0.1/thumbnail> 5. <http://www.w3.org/2003/01/geo/wgs84_pos#long> 6. <http://xmlns.com/foaf/0.1/name> 7. <http://xmlns.com/foaf/0.1/givenName> 8. <http://xmlns.com/foaf/0.1/homepage> 9. <http://www.georss.org/georss/point> 10. <http://xmlns.com/foaf/0.1/logo> 11. <http://purl.org/dc/elements/1.1/description> 12. <http://xmlns.com/foaf/0.1/depiction> 13. <http://xmlns.com/foaf/0.1/nick> 14. <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 15. <http://www.w3.org/2004/02/skos/core#subject> Nevertheless, if I send to http://dbpedia.org/sparql a query like this: SELECT ?property ?value WHERE { { <http://dbpedia.org/resource/Sagunto> ?property ?value } } then the result has many classes and properties which are not listed in "dbpedia_3.9.owl" and are not used in "mappingbased_properties_en.nt". Some examples: Class names: - http://schema.org/Place - http://dbpedia.org/class/yago/Location100027167 Property names: - http://dbpedia.org/property/settlementType - http://www.w3.org/2003/01/geo/wgs84_pos#geometry - http://www.w3.org/2000/01/rdf-schema#label - http://www.w3.org/2000/01/rdf-schema#comment My question is: Where can I find a list of all the class names and property names which I can expect to receive as a result of a query?
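One way to enumerate them directly from the endpoint is a pair of DISTINCT queries, run separately (a sketch; an unrestricted scan can be heavy on the public endpoint, hence the LIMITs):

# classes actually used in rdf:type statements
SELECT DISTINCT ?class WHERE { ?s a ?class } LIMIT 1000

# predicates actually used in the data
SELECT DISTINCT ?property WHERE { ?s ?property ?o } LIMIT 1000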
One thing I could do is get the datasets available for download (Persondata, RawInfoboxProperties, etc.) and make a list of all the different classes and properties present in those datasets. Does that make sense? Perhaps it makes sense for the properties, but what about the other class names? Regards, Juan" "Password recovery problem on mappings wiki." "uHello, I am trying to recover my password on the mappings wiki: I have requested the password yesterday evening but I am not getting the e-mail with the reset instructions, and the limit on password resets (one every 24 h) is *extremely annoying*. What can I do? My username is: CristianCantoro Thanks in advance for your help. Cristian" "Collaboration talks Wikidata - (Dutch) DBpedia" "uHi all, I don't know whether this is the right mailing list, but here is my report on the talks I had with two Dutch Wikidata people. Would you agree with the proposal we worked out? Regards, Gerard A common project for Wikidata and DBpedia Gerard Kuys, February 12, 2014 At the Dutch Chapter meeting at the 1st DBpedia Community Meeting in Amsterdam, we decided to embark on a trajectory of closer collaboration with the Wikidata project.
The idea was to compare datasets and think of ways to mutually improve data quality by way of comparing one dataset to another. The monuments dataset was suggested to be one of the candidate datasets. In order to determine further steps to be taken in this collaboration effort, Gerard Meijssen of Wikidata contacted the GLAM ‘liaison officer’ of the Dutch Wikimedia Foundation, Sebastiaan ter Burg. With Sebastiaan and with Hay Kranen, Wikipedian in residence at the Royal Library of the Netherlands, I had a fruitful conversation on Tuesday, March 11. On the path towards better quality of data, both Wikidata and DBpedia encounter obstacles to be cleared away. At Wikidata, Hay has pleaded for including the Dutch (and German?) PPN identifier for books, authors and keywords into the list of external identifiers that is kept within Wikidata. However, this proposal was rejected so far, on the grounds that it was insufficiently clear to the Wikidata project members what exactly this NTA field (as it is known in Wikidata circles) would contribute. At DBpedia, on the other hand, we meet with the problem of registering data about people’s gender, which cannot be extracted from Wikipedia articles due to editors’ policies and has to be obtained by way of linking to external datasets. The major issue to be solved, however, is how to overcome the boundaries between content compartments that spring from institutions’ collections being separately donated or otherwise brought into Wikipedia (either the encyclopedia or Commons). What we need is finding a way of constructing relations between content across domains. By doing so, we probably also would facilitate the feedback loop donating institutions are eagerly waiting for. When trying to settle for the domains that are most fit for comparison between Wikidata and DBpedia, we identified two domains: Writers’ and Monuments’ data. As a first step, we would want to make comparable dumps of data from either source, and work out an approach for finding all kinds of omissions and errors, and mending them. To be overly ambitious, however, as soon as this work has been done, we would want to take the bolder step and link both domains one to another: how could we find the relations (and translate them into RDF(S) properties) expressing the semantic relation between a person (mostly writers) and any building he or she has had a (documented) connection with. Being the GLAM liaison within the Wikimedia Foundation, Sebastiaan is keen to foster this endeavour wherever possible. He will be offering all kinds of support Wikimedia can provide: meeting rooms and the paying of travel expenses. Wikimedia could also provide due publicity, which might be helpful if we would want to attract volunteers who could help monitor data quality wherever there is no automated way to do so. The main work to be done yet remains with the Wikidata and DBpedia communities. We agreed that we would better limit the number of meetings between (working groups within) both communities. Nonetheless, a kick-off meeting would be nice, and useful to have. We think of two types of meetings: 1. * An initial meeting to set up a working approach, identify, divide and attribute work to be done 2. * One or several follow-up meetings to discuss progress and tackle problems that have arisen and cannot possibly be solved by way of skype conferences. This kind of meeting could be held within the framework of ordinary Wikimedia meetings, like the Wiki Saturdays. 
As soon as both the Dutch Wikidata and Dutch DBpedia communities have approved this approach (or rather – this is what we hope for), a date for the initial meeting could be fixed." "Query based on type" "uHi, I've tried some sample queries with SPARQL. I would like to know how to write a query that fetches different properties based on type. For example: With a given "KEYWORD", if the KEYWORD is a city name, the query should fetch the country, population, etc.; if the KEYWORD is a person name, the same query should fetch the date of birth, known for, etc. Is it possible to write a query like that?! Even if it is possible, would it be efficient? If it is not the right way, then how can I get such type-based information? If the question is too simple, please do forgive me. Is there any reference URL to learn such querying (more than the basic queries)? Thanks, Naga uHi Naga, On 09/20/2012 03:07 PM, Naga Hl wrote: If I understand what you want to do correctly, then the following SPARQL query can achieve that (assuming the keyword you are looking for is "Paris") SELECT * WHERE { ?s rdfs:label "Paris"@en. ?s ?p ?o. } This way of achieving it is efficient, as long as you know the exact keyword, as you can notice when you run it, i.e. it is a quite fast query." "Dbpedia-Freebase raw dump of conditional probabilities" "uTim (and anyone else who is interested) I put the raw dump of conditional probabilities on an external website ( pages/iaa.index.html).
Go to the section on LinkedOpenData and Extraction of Vocabularies on this page and click on the link to the datafile. Kavitha On Aug 10, 2009, at 4:42 PM, Tim Finin wrote:" "Wikipedia extraction framework" "uHi All, about the Wikipedia extraction framework: the input is dumped articles from Wikipedia that I get from the link. Please, I want to know how to get and connect the input to the Wikipedia extraction framework (code). Kind regards, amira Ibrahim abd el-atey uAmira, On Fri, 26 Feb 2010, amira Ibrahim abd el-atey wrote: I have used a simple Perl program with bzcat to do some extraction. I have just written a short blog article about it where you can see the Perl program: There are Wikipedia/Wikimedia mailing lists that might be more relevant for your question, e.g., /Finn Finn Aarup Nielsen, DTU Informatics, Denmark Lundbeck Foundation Center for Integrated Molecular Brain Imaging" "dbpedia lookup service" "uHello, I am currently writing a thesis on web tables and I am using the DBpedia lookup service (prefix search) to link strings in web tables to DBpedia URIs. I would like to provide information in my thesis on how the lookup works, but I cannot find any information on the web. Are there any papers I could cite that describe the lookup service? If not, I would be grateful if someone could provide a short summary of what the lookup is based on, e.g. string similarity to labels, page rank of the wikipedia page, etc. Thanks in advance. Mark uThe DBpedia lookup uses a context independent approach that was last described, if I am not mistaken, in Georgi Kobilarov's paper with the BBC folks. We also describe some of the data that Lookup uses in our LREC'12 paper about DBpedia for NLP datasets. But the best way to know for sure is to take a look at the source code on github. :-) That being said, I think it would be more fair if you compared your solution to DBpedia Spotlight, which has context aware scoring and will use the words in the table to try to guess a URI for each cell. Cheers Pablo On Jan 12, 2014 3:20 AM, \"Mark Reinke\" < > wrote:" "sameAs wikidata help" "uDear all community, I'm writing because I think I have a problem with the extraction of the sameAs triples (interchapter); I have few sameAs triples, for example I have no sameAs for the capital of France: Paris. So I think that the problem could be when I try to extract the wikidata sameAs (WikidataSameAsExtractor), because during the extraction many many warnings about JSON and other things appeared. With the most recent wikidata dump I have no sameAs generated; only the January dump generates something. I put a picture of that here for you (for example): For this I use the dbpedia extraction framework from the 10th April 2015 and the wikidata-20150330 dump; this is the command line: /clean-install-run extraction extraction.wikidata.sameas.properties (in dump directory) Nothing is generated, only empty files. RedirectFinder warning: JSON warning: Do you have this kind of warning? For information, I use the dbpedia extraction framework from the 10th April 2015 and the wikidata-20150330 dump. This is a really big problem for me; it would be great if you have an idea of the problem's origin. And should I do something else later to get the interlanguage links? Thanks in advance. With regards, Raphael Boyer, Inria, France
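A quick sanity check for the missing link, either against the public DBpedia endpoint or a local store loaded with the generated files (a sketch; the Paris resource is the example from the message above):

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?other WHERE {
  # expected to list Wikidata, language-chapter and other equivalent IRIs
  <http://dbpedia.org/resource/Paris> owl:sameAs ?other .
}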
uHi Raphael, There is a problem with the latest wikidata xml dumps and we have a pending pull request that bypasses wrongly exported wikidata entities (as well as other wikidata RDF export dumps). If you are in a hurry, check out the branch to be merged; it's quite stable and needs only a few tweaks not related to sameAs links. Cheers, Dimitris On Mon, Apr 20, 2015 at 11:59 AM, Raphael Boyer < > wrote: uHi Dimitris, Thanks a lot for your response. I'm trying it; this seems hopeful. I'll send another mail about my other problems. Cheers, Raphael uHi Dimitris, Just for information, this generated many sameAs links, but sometimes they take us to a 404 error. Like this : * * * * * * Cheers, Raphael uThe reason is that not all chapters exist. The same applies for all the language dumps we create. On Thu, Apr 23, 2015 at 12:02 PM, Raphael Boyer < > wrote:" "Small patch to dbpedia lookup" "uHiya, Not sure if this is the correct place for development discussions or not so please correct me if I'm wrong. I had a bit of trouble building the dbpedia lookup tool as maven started to complain about not being able to find the nxparser jar despite it being in the lib directory. I have a small patch that upgrades the nxparser version and sets maven to pull in the jar from the project's maven repo on google code: What is the best way to submit this via Sourceforge? I'm also quite interested in trying to add some new functionality to the lookup tool, geospatial search in particular. To do this I first need to be able to build a new index though! I was wondering if anybody had any code or advice for building the surface forms? I have a simple bash (sed + perl) script that parses the links in the wikipedia dumps and seems to work OK, but the parsing is fairly simplistic when it comes to some of the wiki syntax. Would be very interested to know how the original surface forms were built for the current lookup index. Many thanks, Matt uHi Matt, Thanks for your message. We usually move the dev-intensive discussions to dbpedia-developers. Please feel free to subscribe and send us a patch there. The current implementation also seems to have a bug that makes it crash from time to time. I think there are some chars that make it throw a NullPointerException. If you're interested in taking a look at that, we'd definitely appreciate it. About the surface forms, we used DBpedia Spotlight's indexing process to obtain those.
What you see in the lookup index is a subset of: The class used to generate the data is available here: Cheers, Pablo On Sun, Nov 18, 2012 at 7:26 PM, Matthew Haynes < > wrote:" "Airpedia vs DBpedia entity counts" "uI've looking at an analysis of the Airpedia entity types and I have a question about how things are counted between DBpedia and Airpedia. If I look at the DBpedia stats films in EN wikipedia. If I count the Airpedia films it has 88,997 with a confidence breakdown of: 67613 I've looking at an analysis of the Airpedia entity types and I have a question about how things are counted between DBpedia and Airpedia. If I look at the DBpedia stats Tom" "Problems with URI access?" "uHi For the last couple of days I've had trouble using basic uri-based access to DBpedia, either through a browser or a simple Squeak client. Both were working fine in December. For example, trying to load the following page gives a truncated response. In fact it seems that every DBpedia response is being truncated to a multiple of 4096 bytes. This situation is happening from my workplace, which does sit behind some fairly heavy firewall mechanismsbut, as I say, it used to work. By contrast, through my home Internet connection the Web browsers (IE7 and Firefox) get full, uncorrupted responses. Squeak was still getting truncated ones until I made it put \"HTTP/1.1\" into its requests rather than \"HTTP/1.0\". That seemed to solve the problem - but only at home. I see that there has been a server change recently, as well as the upgrade to 3.0. Possibly related? Thanks - Aran uAran, Looks like a bug in OpenLink Virtuoso, which is now the front-end web server (instead of Apache) sitting in front of the Java webapp that generates the HTML pages. The bug can be easily reproduced: curl -0 which gives a truncated result. The -0 argument forces HTTP/1.0. Kingsley, any chance for a quick bugfix? Cheers, Richard On 19 Feb 2008, at 08:11, Aran Lunzer wrote: uRichard Cyganiak wrote: Yes. Zdravko: Please resolve ASAP. Richard: Was this the case prior to last week's move to DBpedia 3.0? Kingsley uOn 19 Feb 2008, at 13:28, Kingsley Idehen wrote: I'm not sure. I *think* this worked fine on the old version that ran on openlink.dbpedia.org before the DNS switchover (the one which had the 20-second delay problem for persistent HTTP connections). But I might be wrong. Richard" "Can not download Mapping_id.xml" "uHi All, I am trying to update all mapping in directory mappings.  I run this command  /run download-mappings I hope I get Mapping_id.xml after do it, but there is no Mapping_id.xml until downloading process finished. What should i do?   Regards, Riko Hi All, I am trying to update all mapping in directory mappings. I run this command /run download-mappings I hope I get Mapping_id.xml after do it, but there is no Mapping_id.xml until downloading process finished. What should i do? Regards, Riko uStrange because I just didDid you update to the latest code? Try $git pull $/clean-install-run download-mappings On Fri, Mar 15, 2013 at 9:33 AM, Riko Adi Prasetya < >wrote: uHi Riko, On 03/15/2013 08:33 AM, Riko Adi Prasetya wrote: I just run the command, and it has successfully downloaded the mappings. Are you sure you run it from the \"core\" folder?" 
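A quick way to reproduce the kind of entity counts compared in the Airpedia thread above is a COUNT query against the public endpoint. This is only a sketch: the class IRI (dbpedia-owl:Film) and the endpoint's default graph are assumptions, and the number returned depends on which DBpedia release is loaded.

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT (COUNT(DISTINCT ?film) AS ?films)
WHERE { ?film a dbpedia-owl:Film . }

Some difference against the Airpedia numbers is expected, since the two resources assign types in different ways and report them with different confidence thresholds.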
"Announcing OpenLink Virtuoso, Open Source Edition, v6.1.4" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.1.4: New product features as of October 31, 2011, V6.1.4, include: Oct 31, 2011, V6.1.4: * Upgrading from previous versions - Added information about upgrading from previous 6.1.x to 6.1.4 - Enabled check for bad index due to XML fragment See: README.UPGRADE * Database engine - Added new implementation of search_excerpt that can handle both ANSI/UTF8 and Wide strings - Added new setting RdfFreeTextRulesSize - Added improved support for inference rules based reasoning and materialized Linked Data Views generated from ODBC/JDBC accessible relational databases - Added option to register post-processing callbacks to SPARQL DESCRIBE - Added initial support for PHP 5.3 runtime hosting - Added aggregate DB.DBA.GROUP_DIGEST which makes it possible to return part of large output from DB.DBA.GROUP_CONCAT without running out of row length limits - Added optimised codegen for built-in aggregate functions - Added option to enable/disable ?P statistics generation re. SPARQL query patterns - Added support for HTML5+Microformat, Microformat/JSON and JSON-LD serialization formats re. SPARQL endpoint - Added support for SPARQL 1.1 IF and COALESCE - Added support for SPARQL 1.1 SPARQL HTTP Graph Store Protocol covering Graph level CRUD operations - Added support for SQL QUERY syntax in declaration of Linked Data Views - Added support for calling XPath/XQuery functions from SPARQL - Fixed code generation using gawk 4 - Fixed code generation for service invocation for case of IN parameter that is not bound in SINV sub-query is neither external/global nor fixed in parent group pattern - Fixed col_default to be same dtp as col_dtp to prevent default value misuse - Fixed compiler warnings - Fixed connection leak in connection pool during long checkpoints - Fixed crash running FILTER query containing IN clause with only one item in it - Fixed deadlock on attempt of qr_recompile during the run of SPARQL-to_SQL front-end - Fixed disable dep cols check - Fixed disabled pg_check_map by default to make checkpoint faster - Fixed handling of GROUP BY and ORDER BY using expressions - Fixed hang or crash after checkpoint is finished - Fixed issues with cost based optimizer - Fixed issue with multiple transitive subqueries in sql optimizer - Fixed issue with ORDER BY expression optimization - Fixed JSON output for native parsers - Fixed key dep cols check for sample - Fixed lock status report - Fixed memory leaks - Fixed possible mutex deadlock - Fixed problems re-creating quad map - Fixed rdfview generation - Fixed recompile all qr's cached on cli connection when dropping a group or creating new graph group - Fixed set sl_owner before cpt_rollback in order to know which thread owns the process, otherwise other threads may wrongly go inside the wait_checkpoint - Fixed skip rules which perform http redirect when doing a POST - Fixed space calculation when changed records does not fit in available space on page - Fixed SPARQL OPTIONAL keyword sometimes causing queries to not return graph matches - Fixed SQL codegen bug in SPARQL queries of R2RML rewriter - Fixed when iri exceeds 2KB limit and flag is enabled then shorten the iri, instead of rejecting it - Rebuilt Jena, Sesame2, and JDBC drivers - Updated documentation * SPARQL and RDF - Added new cartridges for Eventbrite, Eventful, Foursquare, Gowalla, Google+, Google Places, Google Product, Google Profile, 
Gowalla, Guardian, Hyperpublic, Jigsaw, LinkedIn, Plancast, ProgrammableWeb, Seatgeek, Seevl, SimpleGeo, Upcoming, XRD, Zappos, and Zoopla - Added ontologies for OpenLink CV/Resume, Google+, and many others associated with Sponger Cartridges - Added new cartridge for Twitter using Twitter 2.0 REST API - Added enhancements to Facebook Graph API and OpenGraph based cartridges - Added in-built support for social bookmarking to Facted Browser and Sponger generated Linked Data pages - Added new HTML base User Interface for default SPARQL endpoint - Added support for MS-Author-Via: SPARQL, to SPARQL response headers when using SPARQL endpoint - Added support conditional operators such as: like, =, , > ranges, and IN re., native Faceted Browser pages - Added improved permalinks functionality Faceted Browser pages - Added support for javascript-like hrefs in RDFa - Added w3-1999-xhtml/vocab for RDFa 1.1 - Added HTTP status codes in SPARQL graph store protocol - Added API for selective sponging via URL enhanced patterns - Added support for CREATE LITERAL CLASS \"format string\" - Fixed bad conversion of utf8 in rdf/xml - Fixed \"delayed\" filters like ?x p1 ?o1 ; p2 ?o2 . optional { } . filter (?o1 = ?o2) - Fixed map OpenLink Zillow ontology to geo:lat/long - Fixed map oplog:likes_XXX property to like:likes - Fixed minor issues - Fixed SPARUL LOAD INTO command creating duplicate graphs - Fixed translation from nodeID://xxx to _:xxx - Fixed url encoding issues in RDF/XML - Fixed when dropping a graph, also check if there is a quad map for it * ODS Applications - Added ACL eXecute flag - Added RDF/XML and TTL representations to Offers - Added SIOC object services - Added WebID verification service - Added annotation rules - Added app discussion rules - Added discussion IRIs - Added header and head links for IRIs - Added ldap schema support to WebID - Added mail verification service - Added support for WebID idp - Added user's rewrite rules - Added user/mail availability action - Updated CKEditor to version 3.6.1 - Fixed ++ - Fixed ACL using patterns - Fixed API functions - Fixed call auth check only when needed - Fixed changing/deleting events does not trigger re-sync with publication - Fixed Delicious import/publish - Fixed description presentation - Fixed Facebook UI - Fixed IE JS problems - Fixed import atom sources - Fixed move/copy API with wrong source/destination - Fixed navigation and UI - Fixed Offers, Likes and Dislikes, Topic of Interest - Fixed search RSS problem - Fixed SIOC RDF links API functions - Fixed typo in messages - Fixed WebDAV selection Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: * Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): * Home Page: Regards, ~Tim" "how do I get all people and their external links?" "uHi, I'm trying to get a list of all people and their external links. I tried various permutations on: SELECT * WHERE { ?person rdf:type . ?person ?link. } LIMIT 20 But unfortunately this returns only one link per person. What am I doing wrong? The content of this e-mail (including any attachments hereto) is confidential and contains proprietary information of the sender. This e-mail is intended only for the use of the individual and entities listed above. 
uHi Yonatan, There is actually nothing wrong. You just pick 20 random results, where by chance no person has more than one link. You could for example order result by person (which is quite slow): SELECT * WHERE { ?person rdf:type . ?person ?link. } order by ?person LIMIT 20 or filter for a special person: SELECT * WHERE { ?person rdf:type . ?person rdfs:label \"Bill Gates\"@en. ?person ?link. } Greets, Benjamin uI understand. Thanks very much!"
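Because the IRIs in the queries above were stripped when this thread was archived, here is a sketch of the same query written out in full. The class and property names (dbpedia-owl:Person and dbpedia-owl:wikiPageExternalLink) are assumptions about what the original mail used, but they are the usual choices for "people and their external links":

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?person ?link
WHERE {
  ?person rdf:type dbpedia-owl:Person .
  ?person dbpedia-owl:wikiPageExternalLink ?link .
}
ORDER BY ?person
LIMIT 100

As Benjamin notes, a LIMIT without ORDER BY only returns an arbitrary slice, so seeing one link per person in a small sample is expected; adding the ORDER BY groups the links per person, at the cost of a slower query on the public endpoint.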
"Template not showing in statistics?" "uEnglish Wikipedia has a template called \"Singlechart\" that is used on many (~10 000) musical singles[1]. However, the DBpedia statistics page doesn't show any uses of this template at all[2]. Is there something special about this template? [1]: [2]: uHi Rob, tl;dr: Singlechart is excluded on purpose. Including it may make sense, and we could probably even define a useful mapping for it (as an IntermediateNodeMapping [3], to attach its data to the main resource for the single). If you want that to happen, feel free to file an issue :-) [4] Boring details: At some point, DBpedia realized that some templates are not very helpful (IntermediateNodeMapping didn't exist at the time) and added a heuristic to exclude them. Only templates that have at least two named properties and at least 75% named properties are extracted. See lines 120 and 121 in InfoboxExtractor [1]: val countExplicitPropertyKeys = propertyList.count(property => !property.key.forall(_.isDigit)) if ((countExplicitPropertyKeys >= minPropertyCount) && (countExplicitPropertyKeys.toDouble / propertyList.size) The settings are in InfoboxExtractorConfig [2]: val minPropertyCount = 2 val minRatioOfExplicitPropertyKeys = 0.75 The Singlechart template fails this test. Examples: {{singlechart|Austria|4|artist=Frankie Goes To Hollywood|song=Relax}} two unnamed, two named = 50% named {{Singlechart|Hungary|7|artist=Celine Dion|song=I Drove All Night|year=2003|week=23|accessdate=11 October 2014}} two unnamed, five named = 71.4% named It usually has at least two named properties, but never at least 75% named properties. But InfoboxExtractorConfig also contains this comment: // When you generate statistics, set the following to true. To get full coverage, you should // probably set most other parameters here to zero or empty values. If we set minPropertyCount = 0 and minRatioOfExplicitPropertyKeys = 0.0, Singlechart would be included, but probably many other templates of dubious usefulness, so some other parts of the system would probably have to be changed as well. The heuristic should probably be refined, not abandoned. Cheers, JC [1] [2] [3] [4] On 9 July 2014 09:23, Rob Hunter < > wrote: > [2]: >" "No images in some language chapters." "uHi, Why is it that in the latest datadumps [1] there are no images for some language chapters, f.i. Dutch? Can this be fixed? Thanks, Roland [1] Downloads2015-10" "ANN, the new DBpedia Association" "u(please forward) Dear all, we are very happy to announce that we have succeeded in the formation of a new organization to support DBpedia and its community. The DBpedia Association is now officially in action. In the coming months, we hope to raise funding to reach some of the goals outlined in our charter: At the European Data Forum in Athens today and tomorrow, you are able to meet many of the people from the DBpedia Community who have helped to create the DBpedia Association for example Martin Kaltenböck, Michael Martin and Dimitris Kontokostas will be at the the LOD2 Booth at EDF and Asunción Gómez Pérez will be at the LD4LT side event on Friday (Just to name a few) From September 1st-5th in Leipzig, we hope to gather everyone to celebrate this great achievement. Especially on the 3rd of September, where we will have the 2nd DBpedia Community meeting, which is co-located with the SEMANTiCS 2014 (formerly i-SEMANTICS) on September 4-5. 
The people who have all worked together to create this wealth of value under the DBpedia name are so numerous that we are hardly able to know their exact number or all their names. For proper acknowledgement, as a first action the DBpedia Association will start to give out Linked Data URIs during the next months for all its contributors and supporters. Personally, I am very proud to live in such a great age of collaboration where we are able to work together across borders and institutions. Hope to see you in person in September or earlier as linked data under the Sebastian Hellmann" "Large companies all have exactly 151, 000 employees?" "uThis example query from the DBpedia page returns a list of companies which all have exactly 151,000 employees: That seems a rather improbable result.  I'm not a real SPARQL guru, but I don't see anything obviously wrong with the query.  Is the query incorrect or is the issue with the database or the server? Tom uTom Morris wrote: uOn Fri, Apr 16, 2010 at 11:43 PM, Kingsley Idehen < > wrote: uOn Fri, Apr 16, 2010 at 5:46 PM, Dan Brickley < > wrote: uDan Brickley wrote:" "DBpedia Extractors not deterministic?" "uHello all, I am using the DBpedia framework since two weeks. Today I found out that after performing the same extraction job several times I did not get the same results. I used a Wikipedia dump including 100 articles. Using e.g. the PageIdExtractor I got 80, 86, 74, 92 lines in the resulting *.nq-file, only one time 100. Why I do not get the same result everytime? Does anybody knows this problem? Thanks, alex uHi Alexander, thank you for reporting this issue. It seems there has been a race condition, which resulted in the framework not extracting the last few pages in the queue. I just committed the fix for this. Cheers, Robert On Tue, Apr 20, 2010 at 3:22 PM, Alexander Larcher < > wrote:" "What happened to the website ?" "uHi, Something really strange happened with the DBpedia main site under especially now that we have the Google Summer of Code running. List of issues: 1) DBpedia is spelled wrong on the main site, it's *DBpedia *not *DBPedia.* 2) The logo is wrong, instead of the DBpedia logo[1], there is a big W [2] as a logo which looks as someone took the design from the 90's 3) There are big large error messeges first thing on the main page: *Could not load or parse feed* 4) Due to the messed up wiki a lot of the wiki pages disappeared from Google. For example the GSOC 2015 ideas page can not be found anymore, which is a big problem [3] ! 5) Going on the main site under sends you a binary file called \"download\" . Furthermore the issues with the Main site have been going on for years. Almost every week the website is down. Whenever Virtuoso crashes it takes the presentation site with it. We're using this wiki for GSoC which makes it really problematic when stuff like this happens. We really need a permanent solution for this, asap. [1] [2] [3] ideas now) Hi, Something really strange happened with the DBpedia main site under uOuch. Looks like someone changed the Wacko Wiki installation. Probably a version upgrade? Another problem: Spam is much worse than before. Bots create about five to fifteen new users per day. Several link spam pages. On Mon, Mar 23, 2015 at 5:14 PM, Alexandru Todor < > wrote: uHi everyone, Is there something we can do to bring back the previous Wiki, or ideally to have a fancier homepage? Let me know if I can help. Cheers! 
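As an aside on the employee-count thread in this digest: the suspicious values can be inspected directly. A minimal sketch, assuming the figure is stored under dbpedia-owl:numberOfEmployees (the original example query from the DBpedia page is not reproduced in this archive):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?company ?employees
WHERE {
  ?company a dbpedia-owl:Company ;
           dbpedia-owl:numberOfEmployees ?employees .
}
ORDER BY DESC(?employees)
LIMIT 50

If many companies share one identical value, that is more likely an extraction artefact in the dataset than a problem with the query itself.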
On 3/23/15 5:48 PM, Jona Christopher Sahnwaldt wrote: uHi Marco, Actually, there is a new website for DBpedia. It just has to be filled with (new) content. There is this Communications Working Group (in which I participate and is sort of coordinated by Martin Kaltenboeck), but the problem is that this must be done in leasure time and we are all busy and busy. So, I think it would be best to do some organised effort to get the new site going. If we would expect this to last long, reparations to the old wiki would be necessary, but I rather wouldn't do such a thing. Regards, Gerard Van: Marco Fossati < > Verzonden: woensdag 25 maart 2015 11:23 Aan: Jona Christopher Sahnwaldt; Alexandru Todor CC: Onderwerp: Re: [Dbpedia-discussion] What happened to the Hi everyone, Is there something we can do to bring back the previous Wiki, or ideally to have a fancier homepage? Let me know if I can help. Cheers! On 3/23/15 5:48 PM, Jona Christopher Sahnwaldt wrote: uHi Gerard, You read my mind, I remember the new website design was presented during the 2nd meeting in Leipzig. I can volunteer to bring it up if I am given the material and the credentials on the host. Let me know. On 3/25/15 2:57 PM, Kuys, Gerard wrote: uI talked to Dimitris yesterday and he told me the issues arose by an update of the wiki. It's usually not a big problem if there are issues with the main site, now was just a time for this to happen. We should discuss getting the new website online in the dev telco next week. I think if 3-4 people sit down in an afternoon we can port most of the content to Drupal. If it's wanted/needed I can volunteer for that and I can also \"volunteer\" one or two of my students for it. I think we're at a point where it's better to have a half-finished new website than the current one. Cheers, Alexandru On Wed, Mar 25, 2015 at 3:10 PM, Marco Fossati < > wrote: uSounds good, I added the discussion point to the minutes doc On 3/25/15 4:18 PM, Alexandru Todor wrote:" "Fetching song and movie related information" "uHi, I am interested in fetching data related to songs like song name, singer, lyrics, song length and movie name in which the song is present along with movie related information like movie director, producer, label, music director, release year. The information is available properly in wikipedia but when I try to extract the information from DBPedia the information is not matching with that of wikipedia. Is it because the data dumps used for DBPedia are not the latest dumps of Wikipedia? Is there a way to fetch the above data from DBPedia and what would be the confidence level? Thanks and regards, Venkatesh Channal Hi, I am interested in fetching data related to songs like song name, singer, lyrics, song length and movie name in which the song is present along with movie related information like movie director, producer, label, music director, release year. The information is available properly in wikipedia but when I try to extract the information from DBPedia the information is not matching with that of wikipedia. Is it because the data dumps used for DBPedia are not the latest dumps of Wikipedia? Is there a way to fetch the above data from DBPedia and what would be the confidence level? Thanks and regards, Venkatesh Channal uHi, I executed the sparql query provided by Pablo on length information. Select distinct * where { ?song < The query executed but the information that is available is not matching with the information in Wikipedia. 
As an example, One of the returned triples is after executing the : I then did a describe on the subject returned Describe I was expecting to find the singer information, album information, producer, movie director other song and album related information. The corresponding wikipedia link is Here for the song Piyu Bole the information is: No. TitleSinger(s) Length1.\"Piyu Bole \" Sonu Nigam , Shreya Ghoshal 4:21 Album/Movie information in which the song is present is: Directed byPradeep Sarkar Produced byVidhu Vinod Chopra Screenplay byVidhu Vinod Chopra Pradeep Sarkar Story byVidhu Vinod Chopra Pradeep SarkarStarring Vidya Balan Sanjay Dutt Saif Ali Khan Raima Sen Diya Mirza Music byShantanu Moitra CinematographyNatarajan Subramaniam Editing byHemanti Sarkar Nitish Sharma Distributed byVinod Chopra Productions Release date(s) - June 10, 2005 Running time131 minutes CountryIndiaLanguage Hindi Budget[image: INR]25 crore (US$ 4.73 million)[1] Box office[image: INR]32.35 crore (US$ 6.11 million) Information that is different: Producer, Artist name is different in the DBPedia result and Wikipedia. For Artist I expect the information available in the song infobox of wikipedia. Is it because the dump that is used for dbpedia may not be the latest? Thanks and regards, Venkatesh On Tue, Oct 9, 2012 at 9:02 PM, Pablo N. Mendes < >wrote: uHi Venkatesh, On 10/10/2012 10:18 AM, Venkatesh Channal wrote: specifically \" is a movie. Regarding runtime, the runtime is in seconds. So if we consider resource \" as an example, and its corresponding Wikipedia page \" the Wikipedia article that its length is \"6:48\", which is equivalent to 408 seconds. If you want to get more information you should just extend that query, e.g. the following one will give you the producer information as well: Select ?song ?artist ?runtime ?producer ?prodInfo where { ?song dcterms:subject . ?song rdf:type dbpedia-owl:Song . ?song dbpedia-owl:artist ?artist . ?song dbpedia-owl:runtime ?runtime. ?song dbpedia-owl:producer ?producer. ?producer ?prod ?prodInfo } limit 1000 Hope that helps. uHi Mohamed, Thank you for your reply. I wanted to convey the difference in the values rather than to fetch it using sparql query. As I mentioned in my earlier mail,Producer, Artist name is different in the DBPedia result and Wikipedia. For Piyu_Bole the artist is mentioned as where as in wikipedia the artists are: Sonu Nigam , Shreya Ghoshal the producer result from DBPedia is : in wikipedia the result is: Vidhu Vinod Chopra. I am looking for ways to get the same information provided by Wikipedia using DBPedia. Hope the problem is clear now. Thanks and regards, Venkatesh On Wed, Oct 10, 2012 at 7:21 PM, Mohamed Morsey < > wrote: uHi, Just to clarify I am looking to fetch song information and related album information. Song information fields: Song name, lyricist, singers, song length, composer Album information fields : Album/Movie name, language, producer, director, music director, release year It is fine if from a given movie I can fetch all song information or given a song title fetch all the other information. Thanks and regards, Venkatesh On Wed, Oct 10, 2012 at 7:54 PM, Venkatesh Channal < > wrote: uHi, Are you running the query on I don't get any artist associated to You must have got from As Mohamed said, the wikipedia page was redirected to There is something weird though, there is a triple but describe gives only this triple and describe very little. Anyone knows why ? Julien uHi Julien, Thank you for your reply. 
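Because the IRIs in the queries in this thread were lost when it was archived, here is a sketch of what the category-based song query most plausibly looked like. The category IRI (Category:Hindi_songs) is an assumption taken from later replies in the thread, and the other terms follow the dbpedia-owl names quoted above:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?song ?artist ?runtime
WHERE {
  ?song dcterms:subject <http://dbpedia.org/resource/Category:Hindi_songs> ;
        rdf:type dbpedia-owl:Song ;
        dbpedia-owl:artist ?artist ;
        dbpedia-owl:runtime ?runtime .
}
LIMIT 1000

Note that dbpedia-owl:runtime is given in seconds, so a track length of 4:21 in a Wikipedia track listing would appear as a value like 261.0 in the results.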
I have been using along. Earlier Piyu_Bole was one of the replies being returned for song, artist, runtime information with a limit of 100. Now there is no such song with Piyu_Bole. Describe works for . So this entity is present and is a song. but is not being fetched with select query. Is it possible that the dataset may be getting changed in the last 10-12 hours? Just to clarify I am looking to fetch song information and related album information. Song information fields: Song name, lyricist, singers, song length, composer Album information fields : Album/Movie name, language, producer, director, music director, release year It is fine if from a given movie I can fetch all song information or given a song title fetch all the other information. Thanks and regards, Venkatesh On Wed, Oct 10, 2012 at 9:04 PM, Julien Cojan < > wrote: uOk, it is hard to discuss about results on changing data. There there is a mess about this example Piyu_Bole because it was the page of a song : then it was redirected to a film : As far as I understand live.dbpedia.org sparql endpoint contains as well data from the time of dbpedia 3.8 (in graph So if you restrict to graph Julien uOn Wed, Oct 10, 2012 at 12:23 PM, Julien Cojan < >wrote: Which is why you can never really infer anything about what a redirect means on Wikipedia, because it basically means whatever the Wikipedians want it to mean. Sometimes it connects equivalent topics, but often, as in this case, it simply represents something that wasn't \"notable\" enough to have its own page and the redirect goes to something that discusses multiple topics: a film, the soundtrack for the film, the tracks on the soundtrack for the film, etc. Tom On Wed, Oct 10, 2012 at 12:23 PM, Julien Cojan < > wrote: Ok, it is hard to discuss about results on changing data. There there is a mess about this example Piyu_Bole because it was the page of a song : Tom uHi Julien, On 10/10/2012 05:34 PM, Julien Cojan wrote: sorry I didn't get what you mean by very little. I tried query \" describe \", and it gives 109 triples. uHi Mohamed, uHi, I think I am still missing on how to query the information. On executing the query: SELECT * WHERE { ?s < One of the values got was - To find the information about all triples that has the film name as subject. The idea was to find song and singer of those songs. The query executed: select * where { ?p ?o . } One of the values returned is - Here \"Pappu Can't Dance\"@en is a song. I executed the following query to find artist associated with the song that begin with the character 'P'. The song \"Pappu Can't Dance\" was not returned. Select distinct * where { ?song < (regex(str(?song),'P'))} limit 100 The correspong wikipedia link is - http://en.wikipedia.org/wiki/Jaane_Tu_Ya_Jaane_Na Appreciate your feedback and help. Thanks and regards, Venkatesh On Thu, Oct 11, 2012 at 6:20 PM, Julien Cojan < > wrote: uExecuted where? dbpedia.org/sparql? live.dbpedia.org/sparql? On Thu, Oct 11, 2012 at 3:48 PM, Venkatesh Channal < > wrote: uOn Thu, Oct 11, 2012 at 9:48 AM, Venkatesh Channal < > wrote: That article is in Category:Hindi_films, not Category:Hindi_songs and it's a Film, not a song, so it's not going to meet the requirements of your query. It looks like the DBpedia extractor attempted to extract as much information as possible from the page, but that strategy, combined with the way Wikipedians edit is causing confusion. 
Even the article contains infoboxes for both the film and the soundtrack album, the subject is principally, in my opinion, the film. Including triples related to the soundtrack album associated with the same URI is just going to cause confusion. Tom uHi, Pablo - The query was executed at Tom - The first query identifies the name of the movie/film - \"Jaane TuYa Jaane Na\" From the movie I get the list of songs. One of the songs is - Pappu Can't Dance. Then I try to execute the query to find the artist information for all song having 'P' expecting Pappu Can't Dance to be one of the songs. It is not listed though. Thanks and regards, Venkatesh On Thu, Oct 11, 2012 at 7:45 PM, Tom Morris < > wrote: uGood point, Tom. That article is in Category:Hindi_films, not Category:Hindi_songs and it's But maybe the class hierarchy comes to the rescue (Work is a supertype of Song and Film)? Select distinct * where { ?song < (regex(str(?song),'P'))} limit 100 Cheers, Pablo On Thu, Oct 11, 2012 at 4:15 PM, Tom Morris < > wrote: uEven with rdf:type Work the song starting with \"Pappu\" is not among the returned values. Regards, Venkatesh On Thu, Oct 11, 2012 at 8:33 PM, Pablo N. Mendes < >wrote: uOn Thu, Oct 11, 2012 at 11:03 AM, Pablo N. Mendes < >wrote: The main point is that the extracted triples are semantic nonsense because they conflate multiple subjects under a single URI. You've got \"9300.0\"^^ . \"276.0\"^^ . \"1908.0\"^^ . \"221.0\"^^ . Which is the length of what? They all refer to the same subject. Similarly \"Pappu Can't Dance\" isn't a song. It's an (alternate?) title for the film according to the RDF. A human knows it's a song because of the \"Pappu Can't Dance\"@en . \"155.0\"^^ . To make what Venkatesh wants to work happen, you'd need to teach the extractor to figure out what the \"main\" subject of a page was and then have it mint new subject URIs for all related concepts represented on the page which are different (and don't have their on Wikipedia page) such as sound track album, songs on a sound track album, etc. Then you'd also need to teach it that the physical proximity of the track listing and the soundtrack infobox implies that they refer to the same subject. Finally, you'd have to make this robust in the face of different editing & structuring styles by different Wikipedians. I'd love to see the extractor get this smart, but I'm not holding my breath. Tom uIs this the page? It contains infobox film, so it should be of type Film, as Tom says. It is also not in the category you said, as Tom also pointed out. Check the wikipedia page. There is another problem. For some reason, artist and runtime information were not extracted. Look at this query: Select distinct * where { ?song < ?song rdfs:label ?label. ?song rdf:type . optional { ?song ?artist . } optional { ?song ?runtime . } filter (regex(str(?label),'Pappu'))} limit 100 Also worth pointing out, there is a bug in the Linked Data hosting for URIs with \"special chars\". See: And contrast with: describe Cheers, Pablo On Thu, Oct 11, 2012 at 5:07 PM, Venkatesh Channal < > wrote: uOn Thu, Oct 11, 2012 at 11:24 AM, Pablo N. Mendes < >wrote: No, he's talking about this: It's a page about a film with both film and soundtrack infoboxes. If you look at the triples extracted : you can see that both sets of properties have been applied to the same subject. Tom uHi Venkatesh and all, the question here is \"is there a Wikipedia page for song \"Pappu Can't Dance\"? \". 
On 10/11/2012 05:07 PM, Venkatesh Channal wrote: uHi Mohamed, There is no Wikipedia page for song \"Pappu Can't Dance\". Do you mean to tell that if a song is part of a movie and there is no Wikipedia page separately for the song then that song can't be retrieved in DBPedia's live sparql endpoint? Regards, Venkatesh On Thu, Oct 11, 2012 at 9:24 PM, Mohamed Morsey < > wrote: uHi Venkatesh, On 10/12/2012 12:39 PM, Venkatesh Channal wrote: yes that is what I meant. If you have also read Tom's last mail, he has proposed that the extractors should be smart enough to detect that a Wikipedia article may contain more than one infobox, one for the main subject of the article and some others about some related topics. And if it is the case, they should create URI(s) for those related subject(s) and extract data from their infoboxes as well. uHi Tom, On 10/11/2012 05:24 PM, Tom Morris wrote: that's a good idea, I agree with you, that if those subtopics are also taken into consideration, that would be a great achievement for DBpedia uHi, Thank you for clarifying it to me. I understand now that currently extractor don't have that capability to extract to such a sub-level. Is there mapping done for If yes, is the dbpedia data available on live sparql endpoint? The above templates seem to have the necessary information made available. If there is a html parser available I found the element * table class=\"tracklist\"* seems to have the necessary information. If required I can go through the various formats in which the song elements are represented on Wikipedia. Regards, Venkatesh On Fri, Oct 12, 2012 at 7:59 PM, Mohamed Morsey < > wrote: uHi Venkatesh, On 10/12/2012 04:51 PM, Venkatesh Channal wrote: Yes there mappings for Infobox Song, but not for Template:Tracklisting/doc. You can always check the mappings on the mapping Wiki available at You may try JSoup [1], JTidy [2], or HTMLParser [3] Hope that helps. [1] [2] [3] uHi Mohamed, Thank you for pointing to the mapping link and the html parsers. Everytime I try to test the mapping by clicking on the link - I get error with message - \"Service Temporarily Unavailable The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.\" Has the link changed? Is there a timeframe during the link can be tested? Thanks and regards, Venkatesh On Sun, Oct 14, 2012 at 2:12 PM, Mohamed Morsey < > wrote: uHi, On executing the query to find the number of songs that are having information and of type Infobox_song: SELECT count(*) WHERE {?s rdf:type dbpedia-owl:Song . } The count returned is 6004. The value seems to be very less. Is it the right count? Regards, Venkatesh On Mon, Oct 15, 2012 at 2:01 PM, Venkatesh Channal < > wrote: uHi Venkatesh, On 10/15/2012 10:31 AM, Venkatesh Channal wrote: I'm not quite sure about the cause of that problem, but as far as I know is that a few days ago there was a plan to move the mappings wiki to a new faster server. By the way, this mappings wiki is open for contribution, so you can contribute to it and add/revise more mappings in order to make it better. You should only ask for write access to that wiki to be able to do that. uHi Venkatesh, On 10/15/2012 04:29 PM, Venkatesh Channal wrote: Yes that number is correct, but take care that there is another mapping for infobox \"Single\", which gives way more results. I guess that answers your question. uHi Mohamed, Thank you. 
I am trying to understand the mapping fully before I can start giving suggestions for updating it. Regards, Venkatesh On Mon, Oct 15, 2012 at 8:09 PM, Mohamed Morsey < > wrote:" "Why are sparql results incomplete?" "uI have asked a version of this question before but never received a satisfactory response. The best answer I got was that different endpoints serve different data due to resource limitations. I hope I am not rude but I will try again, in the hope there is a clear answer. If you look up Yet a query about Maximilien_Robespierre against dbpedia.org/sparql or live.dbpedia.org/sparql OR a query against my own mirror populated with the latest dbpedia dump gives me hardly any useful results (e.g. select * where { { ?p ?o . } UNION { ?s ?p1 . }} ) Which data is Why can’t we get a dump of that? (Even if the endpoints didn’t serve up the complete data set, at least users could mirror the compete set). regards, Csaba I have asked a version of this question before but never received a satisfactory response. The best answer I got was that different endpoints serve different data due to resource limitations. I hope I am not rude but I will try again, in the hope there is a clear answer. If you look up Csaba uHi Csaba, at least for your example this is caused by a redirect from Maximilien_Robespierre to Maximilien_de_Robespierre. When visiting this URI, my browser redirects me to which contains the large amount of data you are referring to. (Also see the dbpedia-owl:wikiPageRedirects property of When adapting the query to match the redirect: select * where { { ?p ?o . } UNION { ?s ?p1 . }} I get the information as shown before. Hope that helps. Cheers, Daniel uOn Mon, Nov 17, 2014 at 4:14 AM, Daniel Fleischhacker < > wrote: There are some answers on Stack Overflow (disclaimer, I know them because I wrote some of them :)) that address how to work with redirected resources in SPARQL: * [Retrieving properties of redirected resource]( * [Obtain linked resources with SPARQL query on DBpedia]( * [Retrieving dbpedia-owl:type value of resource with dbpedia-owl:wikiPageRedirect value?]( The most common theme is querying with something like select ?p ?v { dbpedia:Foo dbpedia-owl:wikiPageRedirects* ?foo . ?foo ?p ?v } //JT" "Problem with DBPedia SPARQL endpoint" "uHi, I have a problem with a DBPedia SPARQL endpoint. When I am on the PREFIX foaf: select distinct ?s where { ?s foaf:name ?o . FILTER regex(str(?o), \"^Brad\") } And this error occurred : Virtuoso S1T00 Error SR171: Transaction timed out SPARQL query: define sql:big-data-const 0 #output-format:text/html define sql:signal-void-variables 1 define input:default-graph-uri PREFIX foaf: select distinct ?s where { ?s foaf:name ?o . FILTER regex(str(?o), \"^Brad\") } The same thing occurred when I add a LIMIT. It’s because my SPARQL query is wrong or for other else ? Thanks in advance. Regards. uHello Julien, Your query takes so long that it runs over the limits set on the dbpedia endpoint at this time. I ran the query on the dbpedia.org cluster it took 779139 msec which is about 13 minutes to run your original query and return 1441 rows. The problem with your query is the REGEX function which basically forces a table scan on the data as this cannot be done via an index to speed the query up. Note that since you use the DISTINCT keyword, the virtuoso cluster basically cannot shortcut the query. The good news is that Virtuoso has a couple of options that you can use to speed up your query: 1. 
Use an ANYTIME query This means that you fill in the \"Execution Timeout\" field in the / sparql form with say a value of 5000 (in msec) which means that your query basically is transformed into: Select all the triples that satisfy where the predicate is foaf:name and where the string of the name begins with Brad, that you can find within about 5 seconds of processing, and then return the unique ?subject. Obviously this will not return all the triples you might expect, but it does return quickly. 2. Use BIF:CONTAINS Virtuoso can search very efficiently on the ?o as it maintains a complete freetext index on it. You can use the following query: PREFIX foaf: select distinct ?s where { ?s foaf:name ?o . ?o bif:contains \"Brad\" . } which will basically return every triple that contains the word Brad anywhere in the foaf:name. This is very very efficient, but will get you not only people like: \"Brad Davis\"@en \"Brad Strickland\"@en but also \"Edward Brad Titchener\"@en You can combine this with the FILTER if you really only want foaf:name that start with Brad and use: PREFIX foaf: select distinct ?s where { ?s foaf:name ?o . ?o bif:contains \"Brad\" . FILTER regex(str(?o), \"^Brad\") } Since the bif:contains works over a very efficient index, the FILTER only has to go through a very small number of triples and still return the exact same result you would have expected from your original query. Hope this solves your problem. Patrick uThanks for your help, it works fine :-) I also found this solution which works fine too and it not depends of a Virtuoso endpoint : PREFIX rdf: PREFIX rdfs: SELECT DISTINCT ?s WHERE { { SELECT DISTINCT ?s WHERE { ?s rdfs:label ?o ; rdf:type yago:AmericanFilmActors . FILTER regex(str(?o), \"^Brad\") . } LIMIT 100 } } I don't have all the results but if I put this query in a loop with a OFFSET this should be good. Thanks again for your quickly answer. Regards." "'Person' disambiguation to verify Identity over the web" "uHi Devs, I'm Dileepa Jayakody a research student keen on distributed computing domain, particularly interested in social identity, identity management and semantic web concepts. I have started studying for a M.Sc by Research at University of Moratuwa and recently joined LK Domain Registry as a research assistant and currently doing research in 'verified digital identity' domain. I'm looking at protocols like WebID [1], FOAF, linked-data and related semantic-web concepts to implement a methodology to verify digital identity of people and organizations on the web. I'm also interested about the GSoC ideas in dbpedia-spotlight on \"Efficient graph-based disambiguation and general performance improvements\" and \"Generalize input formats and add support for Google mention corpus\" would like to get your ideas on how much relevant they are to my research project of developing a identity verification framework over web of data. Does dbpedia support WebID protocol? I think it will be great to integrate WebID protocol in dbpedia framework and implement a identity verification framework on top of thatIs this a valid use case from dbpedia POV? I have built both dbpedia extraction-framework and dbpedia-spotlight from source and looking into the code at the moment. Thanks and regards, Dileepa [1] WebID uI added the dbpedia-discussion list again in this thread. 
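For completeness, the free-text pattern recommended in the endpoint-timeout thread above, written out with the prefix that was stripped in this archive (the foaf namespace IRI is the standard one and is an assumption about what the original mail contained):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?s
WHERE {
  ?s foaf:name ?o .
  ?o bif:contains "Brad" .
  FILTER regex(str(?o), "^Brad")
}

bif:contains is a Virtuoso-specific extension; on other stores, the subquery-plus-FILTER approach described in the same thread is the portable fallback.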
Maybe we get some feedback from the community Best, Dimitris On Wed, Apr 24, 2013 at 12:09 AM, Dileepa Jayakody < > wrote:" "DBpedia and Yago" "uHi guys, I have a few questions about how DBpedia is using Yago, and hope you may be able to help:) So you've defined a bunch of classes in the DBpedia URI space, based on entities from Yago, e.g. and , right? This is really cool, as having something solid to set rdf:type statements against is great, and we'd really like to reference these URIs on a large scale. However, a few things aren't totally clear to me: 1) Is there any reason why these classes are lowercased rather than following the convention to uppercase the first letter of classes? This make shorthand such as a bit harder to parse visually. 2) Are there any plans to make these URIs dereferenceable and perhaps described in RDFS/OWL? 3) Any news on this? We're shortly going to be setting a lot of links in Revyu, and I'm hesitant to use something that might change. Is it safer just to assume dbpedia.org/class/ and hope that everything falls into place??! One last thing, at you say that: , which if I've understood everything correctly isn't strictly true :) Cheers, and thanks again for creating such a great resource, Tom. uJust one more thingdo you have a definitive list anywhere of the Yago entities you have adopted as DBpedia Yago Classes? Cheers, Tom. On 21/06/07, Tom Heath < > wrote: uHi Tom, Fabian and Chris certainly know better, but for another purpose I got from Fabian the classes and subClasses used by Yago, which are the same as in the WordNet distribution. There are about 45,000 unique of these; I have a cleaned up xls listing at: Otherwise, you can download the Yago package and extract yourself under the facts/subClassof directory from: Also, as a way to prioritize classes viz Zipf's law, you can get WordNet frequencies via the Prolog version ( info from Fabian if you need it. Thanks, Mike Tom Heath wrote: uGreat, thanks Mike, that's really helpful. I also just found (duh, why didn't I look there before!), so together that answers a number of my questions. Thanks again :) Tom. On 21/06/07, Michael K. Bergman < > wrote: u(A more detailed answer from Georgi might be forthcoming) On 21 Jun 2007, at 18:34, Tom Heath wrote: A bug, will be fixed. They will be made dereferenceable. Unfortunately the only description we can provide are labels and possibly a list of instances. We haven't decided on an approach yet. We don't have any ready-made high-quality class hierarchy available that we can use, and there are different possible heuristics that can be used to create a hierarchy and classify the DBpedia instances, which all have their pros and cons (e.g. YAGO, using Wikipedia categories, using Wikipedia templates, using a combination of these). We have discussed three possible approaches: 1. Keep the classes generated through different heuristics in different namespaces, like we do now with dbpedia/class/yago/ 2. Have just one main, “canonical” dbpedia/class/ namespace, where the class would basically be the one described by the corresponding resource at dbpedia/resource/ 3. Use DBpedia resources as classes, e.g. rdf:type It's not clear which one is the best choice. Hearing more opinions would be helpful. A bug in the YAGO dataset; there are quite a number of misclassifications of this kind in YAGO, unfortunately. Richard uHi Richard, Thanks for the reply. I happened to be speaking to Chris earlier and we discussed a few thingsreplies inline and on-list for the record Great, thanks. 
Georgi (et al), Chris and I discussed a couple of actions: 1) if your generation script can be updated to uppercase the first letter of class names that would be great, and will solve the issue when the script is next run. 2) Here at KMi we'll process the existing file at , uppercase the class names, and provide the resulting modified file back to you guys, to replace the current download at that address; this will serve as an interim solution until the script is next run. One questionThere are currently classes like . Do we want to adopt a different syntax that doesn't use the underscores, e.g. ? A quick answer/discussion on this would be useful so we can act on this this afternoon. Peter Coetzee and I here at KMi will investigate generating an RDFS class hierarchy from the YAGO data, and providing this back to you guys for import into Virtuoso, which can then (pretty much) solve the dereferencing issue. I like option 1 the best. It leaves the door open for many complementary class hierarchies to be used, and clearly distinguishes each one. From speaking to Chris this seemed a good way to proceed. On that basis we will very shortly (next few days) start to use classes such as , and assume the capitalisation of the classes (also subject to the syntax discussion above). OK, that's good to know, thanks. I guess/hope this might be reoslved over time. Cheers, Tom. uTom Heath wrote: +1 for this time tested best practice :-) Nice! +1 Kingsley uOn 22 Jun 2007, at 14:40, Tom Heath wrote: Yes, CamelCase without underscores is the way to go. Richard uHi Richard, Richard Cyganiak wrote: [snip] Any possibility you might consider SKOS? [snip] Thanks, Mike uHi Mike, Technically we can do whatever the community decides is best, so the simple answer is yes :) Chris suggested using an RDFS class hierarchy as the basis for some simple reasoning over the data, which sounded reasonable to me. Could you elaborate on the benefits of going for SKOS over RDFS? Cheers, Tom. On 22/06/07, Michael K. Bergman < > wrote: uTo All, Of course, SKOS is in RDFS. The reason to use SKOS is that is it designed specifically to represent and describe knowledge bases for a wide variety of ontological formalisms from controlled vocabularies, to thesauri, to tags and folksonomies to RDFS itself, etc. Just as FOAF (people), DOAP (projects), SIOC (semantic communities) or GeoNames (places; see [1]) are the preferred RDFS vocabularies in their respective domains, so should SKOS be considered for knowledge bases. The specific properties in SKOS such as prefLabel, altLabel, narrower, broader, related and hasTopConcept are well suited to term and category relationships in things like WordNet or DBpedia or GeoNames or whatever, enabling both networked and hierarchical displays and retrievals of concepts. If you are not familiar with SKOS, the W3C site [2] is a great starting point; I also recommend the various use cases that are emerging from the effort [3]. I will shortly be releasing a discussion of an open community project called UMBEL, to be based on SKOS, that will also provide lightweight subject bindings for mapping and registering various KBs. More background will be forthcoming at that time. I hope, Tom, that I am not repeating what you already know, and I may not have all of the details absolutely correct because I'm still learning. In any event, since DBPedia is becoming the flagship for Linked Data, and many additional data sets are adding to it, I think it appropriate that SKOS leadership be shown as well. 
Just my opinion :) Mike [1] [2] [3] Tom Heath wrote: uJust for the record: dbpedia uses SKOS to represent the Wikipedia category system. Richard On 23 Jun 2007, at 18:59, Michael K. Bergman wrote: uHi Mike, all, No worries, me too :) Yes, absolutely! As Richard helpfully points out, DBpedia is using SKOS to represent the Wikipedia category system, which seems right to me. I guess my concern/question is about whether we can say that a is of rdf:type , where is of rdf:type skos:Concept, which is what we'd end up with if we modelled YAGO in SKOS and then set rdf:type statements against these concept URIs. This just doesn't seem right to me because of SKOS's layer of indirection, unlike saying that skos:subject , where rdf:type skos:Concept, which seems fine. Just as FOAF is an ontology for describing things (mainly people), SKOS is an ontology for describing concepts. I think the SKOS Core Guide is useful in this context, in Open Issues->Relationship to RDFS/OWL Ontologies [1]: \"There is a subtle difference between SKOS Core and other RDF applications like FOAF [FOAF], in terms of what they allow you to model. SKOS Core allows you to model a set of concepts (essentially a set of meanings) as an RDF graph. Other RDF applications, such as FOAF, allow you to model things like people, organisations, places etc. as an RDF graph. Technically, SKOS Core introduces a layer of indirection into the modelling.\" \"This layer of indirection allows thesaurus-like data to be expressed as an RDF graph. The conceptual content of any thesaurus can of course be remodelled as an RDFS/OWL ontology. However, this remodelling work can be a major undertaking, particularly for large and/or informal thesauri. A SKOS Core representation of a thesaurus maps fairly directly onto the original data structures, and can therefore be created without expensive remodelling and analysis. \"SKOS Core is intended to provide both a stable encoding of thesaurus-like data within the RDF graph formalism, as well as a migration path for exploring the costs and benefits of moving from thesaurus-like to RDFS/OWL-like modelling formalisms.\" In this context, with YAGO being an ontology already, would it be a shame to not make use of the modelling capabilities of RDFS? I also note the bug logged at [2]: \"Do not use Wikipedia categories for rdf:types. The next version of the dataset should only use YAGO classes as rdf:types and shouldn't use Wikipedia categories any more\". Chris, as logger of that bug, is there any chance of elaborating on your rationale? Cheers :) Tom. [1] [2] index.php?func=detail&aid;=1722285&group;_id=190976&atid;=935520 uHi Tom, check. Well, I would have voted for underscores as that is consistent with our resource URIs, but Richard pointed that CamelCase is the way to go for classes. I like option 2, because it fits best with our goals I think. We want the DBpedia classes to be used within other datasets and to integrate these datasets. And integration is a lot easier if one classification is used. Otherwise we would have to interlink the different classification systems and that would also cause new problems. Cheers, Georgi uHi Georgi, all, On 24/06/07, Georgi Kobilarov < > wrote: Great :) Hopefully soon we can reach a final consensus, at which point both your script and ours can be modified accordingly. In addition to underscores we also have hyphens to deal with, in URIs such as: and (where the \"_lowercase\" has already been CamelCased). 
I'd suggest that we follow the same rule for dealing with hypens, such that we end up with: and What do people think about this? Hmm, I see the issue a little differently; something like this: \"many different people/organisations produce class hierarchies, such as Wordnet, YAGO, whatever. DBpedia can help these exist and be useful within a web of data by providing stable, dereferenceable URIs for the classes, using conventions that exist within the community (CamelCase, HTTP303 redirects etc). With a stable and reusable home, these class hierarchies can be used to enhance the DBpedia dataset (and others, e.g. Revyu, hence this thread ;) by providing things to link to in rdf:type statements, for example.\" So far I think we're in full agreement :) (Just for the record, I don't really care which classification they come from, I just want stable, (likely to be) widely used classes for saying that things on Revyu are of rdf:type pub, restaurant, film, tourist attraction, whatever; nothing more complex than that. This quite complex thread is simply a by-product (ByProduct? ;) of that desire). Regarding I have to (largely) disagree. Integration is a lot easier if one classification is used, but this depends on everyone agreeing to adopt the same classification (and on something meaningful and comprehensive and coherent having been produced). In my experience trying to reach this kind of agreement is the prelude to all manner of problems that are far worse than the challenges posed by trying to integrate data that uses different classifications. Integrating two classification schemes at certain points by defining mappings between specific classes is much more feasible than trying to integrate entire hierarchies. The Semantic Web provides us with an infrastructure for mapping between different ontologies and thereby integrating disparate data sets, so lets use it. Keeping DBpedia \"classification-agnostic\" strikes me as the most Web-like way to proceed. Trying to unify Yago, Wordnet, \"A.N.Other classification scheme\" with some DBpedia super-classification scheme sounds like a recipe for years of blood, sweat, and tears. :( Tom. uI completely agree with Tom on this point and I clearly favour having different classification schemata on top of DBpedia so that people can choose the schema that fits their needs best. Cheers Chris uHi folks, One more question, and hopefully the last for a while;) What do people think about this \"hyphens in Class names\" issue? On 25/06/07, Tom Heath < > wrote: interim re-processing of the data at to conform with this format until the modified generation scripts are run. Cheers, Tom. uHi Tom, I'd suggest to deal with hyphens like with underscores. CamelCase them! No mercy :) Ok, so tell me when the new dump is ready, and I'll replace Cheers Georgi uNo mercy, I like it ;) Peter, please could you modify your script accordingly? We'll get the modified dump to you asap Georgi. Cheers, Tom. On 27/06/07, Georgi Kobilarov < > wrote: uWithout mercy it is then :) I've reprocessed the latest yagoclasses.zip from the DBpedia downloads, and Tom's kindly put it up on . You'll find the CamelCased N-Triples in there, as well as a Linux shell script (req: wget, unzip, php, awk) to download / process the latest yagoclasses.zip from DBpedia into CamelCase (as yagouppercase.nt), as well as generate a sorted list of classes (as yago-classes.nt). Hope that'll be useful for some of you! Cheers, Peter uHi, Ok, I replaced the old with your dump. Thanks! 
Georgi" "enquiry on wikipedia extraction frame work" "uhi all, recently i tried to run Wikipedia extraction framework on infobox template, and i want to ask a question: already, i get dumped english Wikipedia as mysql files. how can i adjust or configure these dumped MYSQL files to Wikipedia extraction framework (code) thanks alot kind regards, amira Ibrahim abd el-atey" "DBPedia - URI for properties & classes" "uDear all, working on a semantic mediawiki and implementing a data structure on our wiki we are trying to re-use as many properties and classes as possible from DBPedia. However I am totally confused how to refer to the properties & classes. Can someone please point out to me what the correct URI for properties and classes is? E.g. I would like to use: Address, as specified within Theatre: or OpeningYear within Building: also, how to I reference to a class, e.g. Band: due to the different mappings, wikis and other references, I am at loss which is the correct URI for these properties & classes, so that I can reference them in our semantic mediawiki. E.g. I am trying to implement this here and need the correct URI for that: Thanks a lot for your help! Best regards, Max from Rock in China - Dear all, working on a semantic mediawiki and implementing a data structure on our wiki we are trying to re-use as many properties and classes as possible from DBPedia. However I am totally confused how to refer to the properties & classes. Can someone please point out to me what the correct URI for properties and classes is? E.g. I would like to use: Address, as specified within Theatre: - uHi Max, For the DBpedia ontology you can use the rdf/xml owl files from: from the mappings wiki) release You can then reference a class with it's full URI i.e. or an ontology property like Best, Dimitris On Mon, Nov 19, 2012 at 2:00 PM, Max-Leonhard von Schaper < > wrote: uHi Max, On 11/19/2012 01:00 PM, Max-Leonhard von Schaper wrote: The URIs of classes start with \" of that class will be \" The URIs of properties start with \" so the URI of that property is \" Simply, you can refer to the class using its full URI, e.g. \" We are using the mappings wiki [1] to map a specific Wikipedia infobox, e.g. infobox of a \"Book\", to a class in the DBpedia ontology, e.g. ontology class \"Book\". We also use it to map the individual attributes of the infobox to properties in the ontology, e.g. attribute \"translator\" in the book infobox is mapped to property \"translator\" in the ontology. This wiki is open for contribution, so people can register themselves to it and ask for write access, and afterwards they can help us in improving the DBpedia mappings. Hope that clarifies the issues. [1] http://mappings.dbpedia.org/" "Please Urgent Question Need answer !!!!" "uDear All, I have successful in creating the environment for running the extraction process, DB is imported in mediawiki instance, however I do not know what exactly should I edit to extract particular terms.e.g. ontology files? I have a list of terms which I want extract from the Database, however I do not know where to 'put' this list, or in which format should it be within the ontology file. Many thanks in advance! With Regards,Abdullah uAbdullah, the subject of your mails should indicate what your mail is about, for example \"need help with next steps of extraction\". A subject like yours doesn't really tell anyone what your mail is about and most users will ignore it. In addition, using so many exclamation marks is generally frowned upon. 
We all have urgent questions. Now to your question - as I said before: Do NOT run the import launcher. You only need it to extract abstracts, which is very complicated. Don't try that now, you always do it later if you really want. In other words, you don't need a database or a MediaWiki instance. What do you want to extract? Please describe what you want to achieve. Maybe you're better off downloading some datasets with extracted data and get the data you need from them. JC On 6 August 2013 09:35, Abdullah Nasser < > wrote: uHi JC, Thank very much for your response. Sorry for the topic of question. I wanted the answer urgently :) . I have list of concepts that I want to extract semantic links between them and wiki concepts and retrieve all articles or concepts from DBpedia ontology that semantically related to my concepts. Moreover, I have read that there infobox included in each wiki-page that we can infer for the semantic links relation through these infobox. So, I would like to exploit all semantic links , relations etc Many thanks in advance With Regrads,Abdullah uWhat do you mean by \"semantic links\" and \"semantic relations\"? On 6 August 2013 18:33, Abdullah Nasser < > wrote:" "run dbpedia on local windows machine" "uHi, I am trying to use Virtuoso (open source edition) to host dbpedia data on * Windows*. The Virtuoso service has worked well, but I am wondering how to run the load_nt.sh file, which as you know is for dbpedia data loading. I have tried this way: I runned the script in Cygwin and the running ended with no error coming up. All .nt files were first loading and then loaded from the console message. But the time for loading is not that long as someone mentioned like several hours, it is just several minutes. Seems something wrong, And it is true, when I wrote some sparql queries to verify, each query returned nothing. Nothing is loaded! This is the command I used in Cygwin: sh load_nt.sh localhost:1111 dba dba /cygdrive/d/dbpedia/data \" \" I am not sure the way I used is right or not? Or, the parameters (graph IRI) of the command is wrong? Thanks in advance J Zheng Hi, I am trying to use Virtuoso (open source edition) to host dbpedia data on Windows . The Virtuoso service has worked well, but I am wondering how to run the load_nt.sh file, which as you know is for dbpedia data loading.  I have tried this way: I runned the script in Cygwin and the running ended with no error coming up. All .nt files were first loading and then loaded from the console message. But the time for loading is not that long as someone mentioned like several hours, it is just several minutes. Seems something wrong, And it is true, when I wrote some sparql queries to verify, each query returned nothing. Nothing is loaded!  This is the command I used in Cygwin: sh    load_nt.sh      localhost:1111       dba          dba          /cygdrive/d/dbpedia/data          '< not? Or, the parameters (graph IRI) of the command is wrong?  Thanks in advance J Zheng uHi Kenny/Zheng ???, Firstly I would suggest you subscribe to and also post your questions on the Virtuoso open source mailing list as I presume you are using the dbpedia_install.tar.gz installation archive from OpenLink Software: You can subscribe to the Virtuoso mailing list as detailed at: If you search on the mailing list their you will see their are posts their from users who have installed on Windows using cygwin. 
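Since the symptom above is that queries return nothing after load_nt.sh finishes, a quick sanity check is to ask the server which graphs it actually holds and how big they are. The aggregate syntax is a Virtuoso extension (later standardised in SPARQL 1.1):

SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples)

If the graph IRI that was passed to the loader does not show up here with a large triple count, nothing was committed and the loader arguments (in particular the graph IRI parameter) are the first thing to re-check.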
Although one point of note is that you need to ensure you are installing on a suitably resourced machine, preferably a windows server variant with at least 8GB of memory, otherwise you will have problems loading all the data sets and running the service itself. Can you provide the output of the following two log files which should give some indication as to the progress of the installation you have attempted: dbpedia_install.sh.log load_nt.sh.log Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 27 Sep 2009, at 10:23, Kenny Guan wrote: uthanks Hugh, Before your reply, I only tried the command here: and now, I know I am supposed to follow the README.text in to run dbpedia_install.sh. I tried and some problems came up. I think they are originally due to the Cygwin environment. Seems, i have to do more settings before I directly run dbpedia_install.sh. Could you give some tips? or direct me to some posts about installation of dbpedia on Windows? *here is messages from dbpedia_install.sh.log:* Install started Mon Sep 28 16:05:50 CST 2009 Checking for VOS setup Starting Virtuoso server, please wait Cannot start Virtuoso server, please consult dbpedia.log file Started. Checking files load_nt.sh : present dbpedia_dav.vad : present dbpedia_post.sql : present dbpedia-ontology.owl : present umbel_class_hierarchy_v071.n3 : present umbel_abstract_concepts.n3 : present umbel_external_ontologies_linkage.n3 : present yago-class-hierarchy_en.nt : present umbel_subject_concepts.n3 : present opencyc-2008-06-10.owl : present opencyc-2008-06-10-readable.owl : present Installing dbpedia_dav.vad Error S2801: [OpenLink][Virtuoso ODBC Driver]CL033: Connect failed to LocalVirtuoso = LocalVirtuoso:1111. at line 0 of Top-Level: Running post-install scripts Error S2801: [OpenLink][Virtuoso ODBC Driver]CL033: Connect failed to LocalVirtuoso = LocalVirtuoso:1111. at line 0 of Top-Level: Install finished. Mon Sep 28 16:06:07 CST 2009 thank you Kenny On Sun, Sep 27, 2009 at 11:12 PM, Hugh Williams < >wrote: uHi Kenny, From your log it would appear that the Virtuoso server failed to start: Thus please check the dbpedia.log to see why the server failed to start, resolve reason why and try again or provide the dbpedia.log file and we can advise on why it is failing to start and how to remedy it. One point of not is to ensure their are no other Virtuoso servers running particularly on the 1111 and 8889 port used for this dbpedia installation, and also that their is no dbpedia.lck file in the installation directory which would also cause the servers failure to start. You can search the mailing list to see posts from others on using the dbpedia install script as indicated in the URL below: Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 28 Sep 2009, at 09:33, Kenny Guan wrote: uHi Hugh, I have shut done all instance service through cmd and then run the script in Cgywin. Still, I cannot start Virtuoso server. AND, the dbpedia.log file, as the dbpedia_install.sh.log indicated, cannot be found. 
here is the message from Cgywin: $ ./dbpedia_install.sh LocalVirtuoso 1111 dba dba Install started Mon Sep 28 20:16:19 CST 2009 Checking for VOS setup Starting Virtuoso server, please wait Virtuoso Open Source Edition (multi threaded) Version 5.0.11.3039-threads as of Apr 21 2009 Compiled for 32 Bit Windows Operating Environments Copyright (C) 1999-2009 OpenLink Software Usage: virtuoso-t [-fcnCbDARMKrBdSIm] [+foreground] [+configfile arg] [+no-checkpoint] [+checkpoint-only] [+backup-dump] [+crash-dump] [+crash-dump-data-ini arg] [+restore-crash-dump] [+mode arg] [+dumpkeys arg] [+restore-backup arg] [+backup-dirs arg] [+debug] [+pwdold arg] [+pwddba arg] [+pwddav arg] [+service arg] [+instance arg] [+manual] +foreground run in the foreground +configfile specify an alternate configuration file to use, or a directory where virtuoso.ini can be found +no-checkpoint do not checkpoint on startup +checkpoint-only exit as soon as checkpoint on startup is complete +backup-dump dump database into the transaction log, then exit +crash-dump dump inconsistent database into the transaction log, then exit +crash-dump-data-ini specify the DB ini to use for reading the data to dump +restore-crash-dump restore from a crash-dump +mode specify mode options for server startup (onbalr) +dumpkeys specify key id(s) to dump on crash dump (default : all) +restore-backup restore from online backup $ +debug allocate a debugging console +pwdold Old DBA password +pwddba New DBA password +pwddav New DAV password +service specify a service action to perform +instance specify a service instance to start/stop/create/delete +manual when creating a service, disable automatic startup The argument to the +service option can be one of the following options: start start a service instance stop stop a service instance create create a service instance screate create a service instance without deleting the existing one delete delete a service instance list list all service instances To create a windows service 'MyService' using the configuration file c:\database \virtuoso.ini: virtuoso-t +service create +instance MyService +configfile c:\database\virtuos o.ini To start this service, use 'sc start MyService' or: virtuoso-t +service start +instance MyService Cannot start Virtuoso server, please consult dbpedia.log file Started. The function gethostbyname returned error 11001 for host \"LocalVirtuoso\". Checking files load_nt.sh : present dbpedia_dav.vad : present dbpedia_post.sql : present dbpedia-ontology.owl : present umbel_class_hierarchy_v071.n3 : present umbel_abstract_concepts.n3 : present umbel_external_ontologies_linkage.n3 : present yago-class-hierarchy_en.nt : present umbel_subject_concepts.n3 : present opencyc-2008-06-10.owl : present opencyc-2008-06-10-readable.owl : present Installing dbpedia_dav.vad The function gethostbyname returned error 11001 for host \"LocalVirtuoso\"/load_nt.sh: line 2: $'\r': command not found ./load_nt.sh: line 9: $'\r': command not found ./load_nt.sh: line 28: syntax error near unexpected token `$'\r'' '/load_nt.sh: line 28: `DOSQL () cat: load_nt.sh.log: No such file or directory Running post-install scripts The function gethostbyname returned error 11001 for host \"LocalVirtuoso\". Install finished. Mon Sep 28 20:16:41 CST 2009 The function gethostbyname returned error 11001 for host \"LocalVirtuoso\". forgive my endless bother. 
thanks Kenny On Mon, Sep 28, 2009 at 6:09 PM, Hugh Williams < >wrote: uHi Kenny, The script is probably slightly miss leading, what the usage option refers to as \"dsn\" is not the traditional ODBC DSN, but rather the hostname:port name for the Virtuoso server instance, so you should run the installer with something like: sh dbpedia_install.sh localhost:1111 dba dba Their are some version of isql that do support ODBC as well which is probably when that term came from Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 28 Sep 2009, at 13:29, Kenny Guan wrote: uTHanks Hugh, I did mistake the ODBC DSN for \"dsn\". thx Dbpedia_install.sh calls \"virtuoso-t -c dbpeida.ini +wait\" inside and this is the first place coming up error (cannot start virtuoso). I am guessing this is a problem related to Cygwin. Cygwin may not resolve the windows environment variable. btw, i feel sorry to say i still didnot find any post about using Cygwin to install dbpedia :( Thanks Kenny On Tue, Sep 29, 2009 at 7:03 PM, Hugh Williams < >wrote: uHi Kenny, Yes, but what is in the dbpedia_install.log, load_nt.log and dbpedia.log file, as one or more of these should give the reason the server is not starting ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 29 Sep 2009, at 13:01, Kenny Guan wrote:" "Several fully funded PhD/postdoc positions in Ontologies/Data Management at TU Dresden" "uDear all: At TU Dresden, we will soon have several fully funded positions for the a major collaborative research project (DFG Cluster of Excellence SFB-912: HAEC) that has just been granted an extension till 2019. The positions are ideal for candidates who want to pursue their PhD, but could also be attractive to postdocs (the salary scale is the same and will be adjusted depending on prior work experience). See [1] for official details. The research project is a close cooperation between the research groups of Franz Baader (knowledge representation), Wolfgang Lehner (databases), and myself. Anybody who sees herself or himself at the intersection of knowledge representation, database theory, and practical data management is most welcome to apply. In particular, we are interested in ontology-based knowledge modelling on top of graph databases. English is our main language in daily communication, research, and teaching. TUD is one of the leading universities in Germany and has been identified by the German government as one of eleven \"Universities of Excellence\". Dresden is also a great place to live. Successful candidates could start almost immediately (from 1st July), but because of the very short time between final budget approval and project start, we are prepared to fill some positions only later. For the same reason, the application deadline on the job announcement is already very close [1]. However, if the positions cannot be filled in this short time, later applications will also be considered. I am happy to answer informal queries. Best regards, Markus [1] en" "Cleanded Wikipedia Category Class (CWCC) Hierarchy" "uHi, I'm curious about the \"Cleanded Wikipedia Category Class (CWCC) Hierarchy\" dataset. I read the quite short description available at \" however is there any other documentation about what exactly this is or what the status of the project is. Is someone currently working on it? Do we have some estimate of when we think a new version of the dataset might be released? 
In case no formal documentation exists as of today, perhaps some of the people behind the project are on this list and can share with us some informal description of what work has been done so far and what is planned for the future . Thanks /Omid uHello, Omid Rouhani wrote: There is a piece of information on the download page: \"The aim of this class hierarchy is to be close to the Wikipedia category system, but without some of its obstacles, e.g. cycles of categories, administrative categories, categories which represent instances instead of classes etc. However, the current extraction script contains some bugs and data cleansing still insufficient to be useful in applications. For this reason, the data set is not published in the SPARQL endpoint.\" Currently, no one is working on improving this data set, because we are busy with other activities. So I cannot make any estimate whether and when there will be a new version. As always, the code is publicly available in the DBpedia SVN. If you are interested in improving it, I can give you advice on what needs to be done. Kind regards, Jens" "Ontological trouble with cities" "uI just found another annoying \"feature\" of both dbpedia 3.4 and 3.5. Some cites, like this one have an rdf:type of \"dbpedia-owl:City\" as one would expect. On the other hand, there are a lot of major \"cities\" which don't have this type, such as infoboxes with useful information about these places. First I thought that a lot of the above \"cities\" aren't really \"Cities\" in the technical sense of the word, for instance, Tokyo and London are both composite entities that consist of multiple cities. (NYC really is a legal city, but the 5 boroughs are third-level administrative subdivisions!) But then I looked at some cities which aren't so ontologically challenged: \"Town Of\" Ithaca, like you often see in New York state) and found that these don't have an rdf:type of \"City\" either. The moral seems to be that the \"City\" type in dbpedia isn't all that useful currently. There is a type, however, in Freebase which is a bit more accurately assigned and roughly corresponds to \"Municipality\" in the dbpedia ontology. I was hoping to use the dbpedia ontology as the taxonomic skeleton for my next site, but now I'm left with the choice of 'bending' the dbpedia ontology (I don't feel bad asserting that Manchester is a dbpedia-owl:City but I do fret about \"Tokyo\" and \"London\") or creating my own taxonomy, which is a job I'll screw up as much as the next guy. Any thoughts?" "Please give me edit permission of Mappings Wiki" "uHello, dbpedia-discussion members. I am a Japanese graduate student who thinks that Mappings (ja) wants to be substantial. The authority of editing DBpedia Mappings Wiki for mapping is required. Would you add the following account to edit group for mapping? Username: Hoehoe User ID: 3959 Yusuke Komiyama The University of Tokyo, Graduate School of Agricultural and Life Sciences, Department of Biotechnology, Bioinformation Engineering Laboratory JSPS Research Fellow Yusuke KOMIYAMA Address: 6th Building in the Division of Agriculture, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657 JAPAN TEL:+81-3-5841-5448 Hello, dbpedia-discussion members. I am a Japanese graduate student who thinks that Mappings (ja) wants to be substantial. The authority of editing DBpedia Mappings Wiki for mapping is required. Would you add the following account to edit group for mapping? 
Username: Hoehoe User ID: 3959 Yusuke Komiyama The University of Tokyo, Graduate School of Agricultural and Life Sciences, Department of Biotechnology, Bioinformation Engineering Laboratory JSPS Research Fellow Yusuke KOMIYAMA Address: 6th Building in the Division of Agriculture, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657 JAPAN TEL:+81-3-5841-5448 uDone! Happy mapping and welcome to the DBpedia community! On Wed, Jul 25, 2012 at 3:40 PM, Yusuke KOMIYAMA < > wrote:" "Decoding the Freebase Quad Dump" "uI’ve written up the process used by infovore to create :BaseKB from the Freebase Quad Dump here I’m doing this now not just to demonstrate the correctness of :BaseKB, but also to demonstrate the correctness of the Freebase Quad Dump on which it is based. I’ve got concerns that other possible export formats might be “valid” RDF but may not maintain the properties that make SPARQL queries against :BaseKB function almost exactly like MQL queries against graphd. Infovore, the framework that creates :BaseKB, is available on github this, plus the above documentation, make it possible for anyone to verify these claims. Infovore contains a test suite that can be runs SPARQL queries against a triple store loaded with :BaseKB that confirms correct operation. Infovore passes all tests when run against the 2012-11-04 quad dump. A 1.0 release of infovore is in progress. This is a matter of a single patch to locate some temporary files in the right place and a small amount of additional documentation. It may take a few more days because a complete test cycle including loading into a triple store and running tests takes about 12 hours of wallclock time. People are quite aware of the value of testing of software, but it’s taken longer for people to realize that data products need compatibility testing, particularly in the RDF and semantic space. I’d like to advise Freebase to resume publication of the quad dump until it can demonstrate the correctness of any alternative data export. In fact, with infovore available under a Apache License and all of my claims independently verifiable, the freebase quad dump could remain in use indefinitely. Freebase users should demand a correct export. I’ve written up the process used by infovore to create :BaseKB from the Freebase Quad Dump here also to demonstrate the correctness of the Freebase Quad Dump on which it is based. I’ve got concerns that other possible export formats might be “valid” RDF but may not maintain the properties that make SPARQL queries against :BaseKB function almost exactly like MQL queries against graphd. Infovore, the framework that creates :BaseKB, is available on github these claims. Infovore contains a test suite that can be runs SPARQL queries against a triple store loaded with :BaseKB that confirms correct operation. Infovore passes all tests when run against the 2012-11-04 quad dump. A 1.0 release of infovore is in progress. This is a matter of a single patch to locate some temporary files in the right place and a small amount of additional documentation. It may take a few more days because a complete test cycle including loading into a triple store and running tests takes about 12 hours of wallclock time. People are quite aware of the value of testing of software, but it’s taken longer for people to realize that data products need compatibility testing, particularly in the RDF and semantic space. 
I’d like to advise Freebase to resume publication of the quad dump until it can demonstrate the correctness of any alternative data export. In fact, with infovore available under a Apache License and all of my claims independently verifiable, the freebase quad dump could remain in use indefinitely. Freebase users should demand a correct export. uI'm confused by the editorial comments at the end: On Fri, Nov 16, 2012 at 12:52 PM, < > wrote: Google, as far as I'm aware, is still publishing a quad dump and I've never heard any reports about it not being \"correct\" (whatever that means in this context). Can you expand on what you're trying to say? Tom I'm confused by the editorial comments at the end: On Fri, Nov 16, 2012 at 12:52 PM, < > wrote: I’d like to advise Freebase to resume publication of the quad dump until it can demonstrate the correctness of any alternative data export. In fact, with infovore available under a Apache License and all of my claims independently verifiable, the freebase quad dump could remain in use indefinitely. Freebase users should demand a correct export. Google, as far as I'm aware, is still publishing a quad dump and I've never heard any reports about it not being 'correct' (whatever that means in this context). Can you expand on what you're trying to say? Tom" "Zotero, Wikipedia and Reference Citations" "uTo All, I thought this announcement for Zotero's new support for Wikipedia was significant: Also, I have been meaning to recommend for some time the inclusion of Zotero sources as part of the Linking Open Data roadmap. This free Firefox plug-in has built-in support for most standard citation formats and can automatically extract citation information from hundreds of sources including WorldCat, CiteSeer, PubMed, Endnote, etc. A full list of the sources that Zotero extracts from may be found at: with support for these citation formats: For those of you who may not know, I have been a fan of Zotero for quite some time as an exemplar of practical user interfaces in a Firefox plug-in. While the system to date has been designed for researchers and professionals in the humanities and library sciences, I think their basic approach is easily extensible to any structured data, including RDF. Thanks, Mike BTW, by way of introduction, I am also cc'ing Dan Cohen and Josh Greenberg, two of the project leads on Zotero, with this email. Dan and Josh, I invite you to look at the mail archives of these two groups and to offer any commentary should you so choose. :) MKB uHi Michael, Thanks for make me discovering this wonderful tool! I just published a blog post that explains how Zotero could be integrated into the current Seamantic Web environment. Mr. Cohen and Mr. Greenberg: what do you think of that idea? If I refers to the development of the Semantic Radar add-on that implemented these two features (the open and the pinging) it hasn't been a big development cost and it gave a great tool to the semweb community to discover SIOC data and start creating new tools using this data. But the more interesting is certainly the fast that Zitgist would become a \"search-engine\"/\"citations-browser\"/\"citations provider\" for Zotero. Take care, Fred uHi again, Sorry but I forgot to include the link to my blog post, sorry: Salutations, Fred uHi Fred, Excellent write-up and thanks for seeing the value so quickly! I was hoping someone who actually knew what they were doing would grab Zotero and begin integrating it with the Linking Open Data vision. Thanks again. 
Now, all that next needs to be done is to figure out how Zotero can use any externally specified ontology as its data definition (RDF) schema! :) Mike Frederick Giasson wrote: uMike, I hear those voices whisper 'dbpedia' ;) BTW: I'm working on an UI for dbpedia to enable end-users to explore the dbpedia-dataset as well as with it linked datasets. I will hopefully have something to show next week Cheers Georgi Von: im Auftrag von Michael K. Bergman Gesendet: Do 12.04.2007 23:21 An: Linking Open Data Cc: Josh Greenberg; ; Daniel Cohen Betreff: Re: [Dbpedia-discussion] [Linking-open-data] Zotero,Wikipedia and Reference Citations Hi Fred, Excellent write-up and thanks for seeing the value so quickly! I was hoping someone who actually knew what they were doing would grab Zotero and begin integrating it with the Linking Open Data vision. Thanks again. Now, all that next needs to be done is to figure out how Zotero can use any externally specified ontology as its data definition (RDF) schema! :) Mike Frederick Giasson wrote: uMike, I hear those voices whisper 'dbpedia' ;) BTW: I'm working on an UI for dbpedia to enable end-users to explore the dbpedia-dataset as well as with it linked datasets. I will hopefully have something to show next week Cheers Georgi Von: im Auftrag von Michael K. Bergman Gesendet: Do 12.04.2007 23:21 An: Linking Open Data Cc: Josh Greenberg; ; Daniel Cohen Betreff: Re: [Dbpedia-discussion] [Linking-open-data] Zotero,Wikipedia and Reference Citations Hi Fred, Excellent write-up and thanks for seeing the value so quickly! I was hoping someone who actually knew what they were doing would grab Zotero and begin integrating it with the Linking Open Data vision. Thanks again. Now, all that next needs to be done is to figure out how Zotero can use any externally specified ontology as its data definition (RDF) schema! :) Mike Frederick Giasson wrote: uHi Michael, Thanks, next step is to make the Zotero team interested into this idea. Well, Zotero, with their RDF export, already use some ontologies to export the citations. It was looking quite good when I took a look at it, but it would be possible to change that a little bit if need to be (I think so at least, dunno the implication in their system). Take care, Fred uOn Apr 13, 2007, at 8:53 AM, Frederick Giasson wrote: Thanks so much for this discussion, and to Michael for kicking it off and bringing the parties together. I'm definitely interested in exploring this, and have some brief notes (my apologies uHi Dan," "Directly sampled importance score for DBpedia" "uThe first draft of :SubjectiveEye3D (version 0.9) is now available at under a CC-BY-SA license. :SubjectiveEye3D contains data from DBpedia 3.9. :SubjectiveEye3D is a subjective importance score derived from Wikipedia usage data published in the industry standard N-Triples format. Concepts in the ?s field align with DBpedia concepts. We \"smush\" importance along redirects, so that a topic like :Justin_Bieber gets credit for all of the page views for all of the alternative renderings of his name. :SubjectiveEye3D is based on Wikipedia usage from 2008 to 2013 so it is has an unavoidable bias towards topics people have been interested in recently (for instance, the 2012 \"Avengers\" movie kicks the 1977 \"Star Wars\" movie to the curb.) However, inside that time period, interest has been normalized on a monthly basis to counteract the bias that would be caused by the increase in usage over time. 
:SubjectiveEye3D is almost certainly a better importance score than the \"gravity\" which was packaged with old versions of :BaseKB, but it is only available, at the moment, aligned to DBpedia. An alignment to :BaseKB is next, since the code used to create :SubjectiveEye3D can be easily modified to do vocabulary changes, such as mapping DBpedia to :BaseKB vocabulary and vice versa. The current version is limited to topics in the English Wikipedia, however, the source material is multilingual and by merging in the correct redirect and page id files it should be possible to produce a similar product for any other language. Source code for the code that did the processing is here telepath" "Eclipse and Jena" "uhow to extract data about the museums from dbpedia using eclipse and jena and in order to do that what files i need? how to extract data about the museums from dbpedia using eclipse and jena and in order to do that what files i need? uHi Marwa, On 05/22/2013 12:46 AM, Marwa Gaper wrote: Please have a look on [1] and [2]. Hope they help. [1] [2] ?page_id=492" "Add data to dbpedia in Virtuoso" "uHi, Let's assume this page in Virtuoso: How can I add an external link to Virtuoso? (add a URL to dbpedia-owl:wikiPageExternalLink) ? Best regards, Bart uHi Bart, On 02/14/2013 12:33 PM, jazz wrote: If I understand what you want to do correctly, then your question is about how to import DBpedia data into Virtuoso. This post explains how to do this [1]. [1] uHi Mohamed, Thanks for your reply. This is indeed what I followed. Now I would like to add some information to this. Do you know how to do it? Regards, Bart On 14 Feb 2013, at 14:50, Mohamed Morsey wrote: uOn 2/14/13 8:53 AM, jazz wrote: Do you mean: I want to insert additional triples into the Virtuoso Quad Store? If so then you can use any of then methods outlined [1]. In the simplest case, you can put your triples in a turtle document and then just upload to Virtuoso via iSQL, Conductor, /sparql endpoint. Turtle Example: Turtle Start . Turtle End Links: 1. VirtRDFInsert uThanks Kingsley, I tried this: 6.2.10.9. Quad Store Upoload Offers upload to Quad Store from file or Resource URL: Figure: 6.2.10.9.1. RDF As named graph I used The file I uploaded is this: . But, still I cannot see the added link (en.wikipedia.org/wiki/Madrid) at this location: Do you know what I am doing wrong? Best regards, Bart On 14 Feb 2013, at 15:17, Kingsley Idehen wrote: uHi Jazz, I note you indicate the graph you upload the data to as being \" sparql select distinct ?g where {GRAPH ?g{?s a ?t}}; I hope this helps Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // Weblog u0€ *†H†÷  €0€1 0 +" "Problems counting all the instances of classes and subclasses" "uHi all, I have a problem querying dbpedia, mainly with timeout with the query. My goal is to count for all the classes and subclasses (if exist), their instances. (*) I have tried to use this \"ugly\" query to get the classes with their corresponding subclasses: But when I add a count on the subclasses, it turns out to an error I would really appreciate your help. TIA Best, Ghislain (*) That's the reason why I came across the previous error for rdfs:label :Sport @en in my previous mail. uTry to use an inner query to select the classes where you use a limit+offset. And then you count the instances Can you give the query, that would be easier to work on it. Julien uHi Julien, Here is the query> Can you give the query, that would be easier to work on it. 
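For the museum question above, the pointers in the reply come down to running a query like the following against the http://dbpedia.org/sparql endpoint (from Jena it can be submitted with QueryExecutionFactory.sparqlService). dbpedia-owl:Museum is the ontology class for museums; the LIMIT just keeps the result manageable:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?museum ?name ?location WHERE {
  ?museum a dbpedia-owl:Museum ;
          rdfs:label ?name .
  OPTIONAL { ?museum dbpedia-owl:location ?location }
  FILTER ( lang(?name) = "en" )
}
LIMIT 100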
SELECT DISTINCT ?sub ?label ?parent FROM WHERE {?sub rdf:type ?class. FILTER (?class=owl:Class || ?class=rdfs:Class) . ?sub rdfs:subClassOf ?parent FILTER(!isBlank(?parent) && ?parent!=owl:Thing) . OPTIONAL {?sub rdfs:subClassOf ?sub2. ?sub2 rdfs:subClassOf ?parent. OPTIONAL { ?parent owl:equivalentClass ?sub3 . ?sub3 rdfs:subClassOf ?sub2 } FILTER (?sub!=?sub2 && ?sub2!=?parent && !isBlank(?sub2) && !bound(?sub3) )} OPTIONAL { ?sub rdfs:label ?label FILTER(LANG(?label)='en')} FILTER (!isBlank(?sub) && !bound(?sub2)) } I see how to use your idea Ghislain uHi Ghislain, On 12/07/2012 11:52 AM, Ghislain Atemezing wrote: This thread [1] was discussing something similar to what you want to achieve, hope it is useful for you. [1] transitive-dbpediavirtuoso-instance-queries-in-sparql" "dbpedia tutorial" "uIs there a tutorial, with many examples, on how to extract information from DBPedia from SPARQL endpoints? If there isn't, it would be extremely helpful to have one. I would learn from it myself, and I would recommend it to students. uAll that I know of are the example links on Cheers, Max On Tue, Mar 15, 2011 at 19:51, Alexander Nakhimovsky < > wrote:" "invalid variable types in sparql query results ?" "uHello, I'm exploring the DBPedia SPARQL endpoint both interactively as well as programmatically (using the Python rdflib module). I'm trying to query persons with their birth- and death- dates. However, many of the returned dates appear ill-formatted. I see cases such as \"1920-11-03+02:00\"^^ where the given string isn't parsable as a date (and thus the rdflib automatic conversion from rdflib.term.Literal to datetime.date fails). Could someone please explain what's going on there ? Is this a known issue in the dataset ? Am I doing something wrong in my query ? Are there any workarounds (and are there plans to fix this in future updates) ? Many thanks, Stefan uHey Stefan, looks like \"T\" (timezone separator) is missing. I think the correct form should be \"1920-11-03T+02:00\". Martynas graphityhq.com On Sun, Sep 21, 2014 at 8:47 PM, Stefan Seefeld < > wrote: uHi, the data dumps available for download do not contain the \"+02:00\". For example, the retrieved from the SPARQL endpoint. However, in the original data we have \"1920-11-03\"^^ . The invalid timezone specifier seems to be introduced by Virtuoso and looks like a known problem in Virtuoso (according to [1]). Cheers, Daniel [1] 230 uMartynas, Daniel, thanks to both your replies. I have been reading a little more (notably on strings are actually valid. Note that 'T' is the time spec separator, not timezone. Timezone may contain a 'Z', which is equivalent to '00:00', but something like '+12:00' would be valid, too. So, sorry for all the noise, it seems I need to fix the software on the receiving end (which is rdflib). :-) Regards, Stefan" "Querying for keywords while discarding accents" "uHello, I am using dbpedia to work with locations in order to compare them and determine if two locations are same / similar and to what extent. Since my data source can be user input, the data normally does not match the exact resource / label defined in dbpedia. I am using the sparql endpoint for this (right now, i am using the dbpedia endpoint but i intend to use a local mirror at a later stage). I am looking to address this but still haven't found a good way to do so. I give an example here to elaborate. Take for example the region Rhône-Alpes in France. If i search for Rhone-Alpes in the label, i don't see any results. 
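Coming back to the class-counting question: rather than enumerating subclasses by hand, the per-class instance counts can be computed in a single aggregate query (COUNT/GROUP BY needs SPARQL 1.1 support, which Virtuoso provides). It may still hit the public endpoint's timeout, in which case paging over the classes with LIMIT/OFFSET, as Julien suggests, is the workaround:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?class (COUNT(?instance) AS ?instances) WHERE {
  ?class a owl:Class .
  ?instance a ?class .
}
GROUP BY ?class
ORDER BY DESC(?instances)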
Neither in the disambiguation pages or even through the keyword search (Lookup) api. Is there a way to address this issue? I want to query such that i get the page Rhône-Alpes as one of the results when i search for Rhone-Alpes for example. This also extends to labels in different languages. My input does not specify the language so the input might be in different languages. For instance, Italia, Italy, Italie all refer to the country Italy in different languages. Thank you for any suggestions / help in advance. Best Regards, Ghufran Hello, I am using dbpedia to work with locations in order to compare them and determine if two locations are same / similar and to what extent. Since my data source can be user input, the data normally does not match the exact resource / label defined in dbpedia. I am using the sparql endpoint for this (right now, i am using the dbpedia endpoint but i intend to use a local mirror at a later stage). I am looking to address this but still haven't found a good way to do so. I give an example here to elaborate. Take for example the region Rhône-Alpes in France. If i search for Rhone-Alpes in the label, i don't see any results. Neither in the disambiguation pages or even through the keyword search (Lookup) api. Is there a way to address this issue? I want to query such that i get the page Rhône-Alpes as one of the results when i search for Rhone-Alpes for example. This also extends to labels in different languages. My input does not specify the language so the input might be in different languages. For instance, Italia, Italy, Italie all refer to the country Italy in different languages. Thank you for any suggestions / help in advance. Best Regards, Ghufran uHello, I think you are going to do some preprocessing. For example to handle accents, you can just remove them (in your program/script/) before transforming it to sparql. Some labels are present in different languages in DBpedia, maybe you could use that ? 2014-06-27 10:57 GMT+02:00 Mohammad Ghufran < >: uHi, I also think you should do some preprocessing using ASCII Folding techniques. You could fold labels and add them as additional surface forms for the entity. The same process would apply for labels coming from different languages. I have successfully used this approach in a project where Solr was used to index entities coming from DBpedia. Cheers Andrea 2014-06-27 13:08 GMT+02:00 Romain Beaumont < >: uHello, Thank you for your reply. Yes, I tried doing that. If i try to remove the accents, i normally get a redirection page in the search results. I can then get the resource uri for this result and get the actual resource page. However, this only happens sometimes. For example, a region in France called Isère has the following page: If i access the page without the accent, I am still redirected to the correct page. However, if I search for the plain string in the label, I don't get any results. Here is the query I am using: PREFIX dbpedia-owl: PREFIX rdfs: SELECT DISTINCT ?place WHERE { ?place a dbpedia-owl:PopulatedPlace . ?place rdfs:label ?label . FILTER (str(?label)= \"Isere\") . } The language is not known a-priori, as i said in my earlier message. I am trying to make my code language independent. So I cannot use the language. What is interesting is the fact that dbpedia itself redirects the url wondering how this \"magic\" is done. 
Mohammad Ghufran On Fri, Jun 27, 2014 at 1:08 PM, Romain Beaumont < > wrote: uDid you consider using the keyword search ( for example 2014-06-27 13:46 GMT+02:00 Mohammad Ghufran < >: uHi, there is no magic in that. It only happens that wikipedia has got a page Isere ( Isère ( Hence the framework links the two DBpedia entities together in a triple - dbpedia:Isere dbpedia-owl:wikiPageRedirects dbpedia:Isère - - However, I think this is not always true for all the pages which contain non-ASCII chars, that is wikipedia is not filled with redirects from ASCII folded pages. - - This is why in my opinion you should enrich the data with additional triples which link ASCII folded and other languages labels to the original entity, e.g. - dbpedia:Italy rdfs:label \"Italy\"@en - dbpedia:Italy rdfs:label \"Italia\"@it - and - dbpedia:Isère rdfs:label \"Isère\"@en - dbpedia:Isère rdfs:label \"Isere\"@en - - (this is just an example, I would not use rdfs:label for the ASCII folded label but another property). - - Hope this helps. - - Cheers - Andrea 2014-06-27 13:46 GMT+02:00 Mohammad Ghufran < >: uFor any type of search application, you not only want to do case and accent folding, but also Unicode normalization (you could have both precomposed and combining accent versions of the è in Isère). Typically a search engine could be directed to normalize both the text before indexing and the query. If DBpedia doesn't support this, you could look at using something like Apache Jena's SOLR-based text search support . Tom On Fri, Jun 27, 2014 at 8:00 AM, Andrea Di Menna < > wrote:" "Local Mirror Setup of English DBPedia" "uHi, What should be the minimum system configuration to setup a local mirror of English version of DBPedia? Thank you, Deepak Hi, What should be the minimum system configuration to setup a local mirror of English version of DBPedia? Thank you, Deepak uHi Deepak, On 02/25/2013 06:11 AM, deepak Naik wrote: This guide [1] describes how to establish a DBpedia mirror. I see that you can still setup a local DBpedia mirror on a less powerful machine than the one described in that guide, but it's still useful step-by-step guide for how to do that. [1]" "Comparing two nodes on dbpedia" "uHi, I'm wondering if exists some tool or (better) an online webservice which can be used to \"compare\" two nodes in dbpedia. With \"compare\" I mean something to rapidly identify which properties have in common the nodes, which are the paths from one to the other and so on. I'm not just interested in a score of similarity, but also in the clues that make two nodes similar. Cheers, Riccardo Hi, I'm wondering if exists some tool or (better) an online webservice which can be used to 'compare' two nodes in dbpedia. With 'compare' I mean something to rapidly identify which properties have in common the nodes, which are the paths from one to the other and so on. I'm not just interested in a score of similarity, but also in the clues that make two nodes similar. Cheers,    Riccardo uHi Riccardo, paths between two nodes are computed by RelFinder: Similarity as such is rather general, and probably there is no universal notion of similarity that fits all application domains and user needs. Best, Heiko Am 04.02.2015 um 14:36 schrieb Riccardo Tasso: uThanks Heiko, it's the most similar service to what I was thinking. I was just trying to compare two entities (just for my curiosity) without the ambition of calculating a \"semantic similarity\" or black magic. Just something more easy than comparing triples by hand. 
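One way to turn the redirect observation above into a query — it only works when Wikipedia happens to have an unaccented redirect page for the entity — is to match the plain label against redirect pages and follow dbpedia-owl:wikiPageRedirects:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?place WHERE {
  ?alias rdfs:label "Isere"@en ;
         dbpedia-owl:wikiPageRedirects ?place .
  ?place a dbpedia-owl:PopulatedPlace .
}

Since such redirects are not guaranteed to exist, folding the labels offline (ASCII folding plus Unicode normalization, as suggested above) remains the more robust option.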
Also some other non graphical tool could fit my curiosity. Good work, Riccardo 2015-02-04 17:31 GMT+01:00 Heiko Paulheim < > : uTo find which properties they share in common: SELECT ?p ?o WHERE { ?p ?o . ?p ?o . } For paths between them, SPARQL does not have a native function to find a path at N hops other than property paths, which does not perform well at scalebut could potentially do the trick for youjust expect that it may take quite some time to rundepending on how far apart the nodes are. Also, property paths don't return the pathjust the length, if I remember correctly. My database does have a function for finding paths of length N and returns the \"fat\" neighborhood along that path(s), but there is no public SPARQL endpoint for itruns on a supercomputer. Does this help out? Aaron" "DBpedia Updates Integrator" "uHello Mohammed and DBpedia list, I try to use the DBpedia Live integration, but there seem to be some issues with the dumps. Some URLs are URL-encoded, though some are IRIs. That makes DBpediaLive hard to use. Same holds for the DBpediaLive Sparql endpoint [ SELECT * WHERE { ?s ?p } and SELECT * WHERE { ?s ?p } deliver different results. Is that issue known, relevant, and are there any efforts to move to IRIs someday? Someone can give a hint how to easily transform the dumps to IRIs for the between time? Best regards and a sunny weekend Magnus uHi Magnus & all, first a minor correction - IRIs. URIs don't allow non-ASCII characters, IRIs do, that's the only difference. [1][2] That being said, I'm not very familiar with DBpedia live and can only speculate about this behavior. Maybe it has to do with the changes in the URI encoding we implemented for the DBpedia 3.8 release. See Maybe the URIs using percent-encoded parentheses were produced before the code changes for 3.8? I don't know. Regards, JC [1] [2] On 26 July 2013 18:33, Magnus Knuth < > wrote: uP.S.: Maybe this script can fix the dumps: On 26 July 2013 23:29, Jona Christopher Sahnwaldt < > wrote: uHi Jona, Magnus, and all, On 07/26/2013 11:29 PM, Jona Christopher Sahnwaldt wrote: Yes, that's right, and we will migrate DBpedia-Live to the latest code very soon, so that issue should vanish. uHi Jona and all, Am 26.07.2013 um 23:29 schrieb Jona Christopher Sahnwaldt: Sure, just wanted to name the different URI representations somehow. However, both named URIs are different but refer to the very same entity. That might be a reason, but in the contemporary dumps, the old percentage-encoded non-ASCII symbols are still contained. Apparently, at least all subject-URIs are percentage-encoded, while at least some objects are referred by IRIs without percentage-encoded symbols = inconsistent. Still: Can someone give a hint how to easily transform the N3-dumps to IRIs, preferably by command line instruction? Regards Magnus uHey, sorry for abusing this thread, but somehow it belongs to it. I also discovered some problems concerning, IRI/URI escaping/non escaping Thing. Querying returns \"correct and expected\" result, but retrieved IRIs aren't valid. By changing authority Part from dbpedia.org to live.dbpedia.org results get valid. It looks like dbpedia.org is already using correct updated IRI version, but cannot cope with special chars. Where as live.dbpedia.org can but escaped Data is this still present. Is this a assumption correct, or is there a mistake in my approach? 
Query: PREFIX rdfs: SELECT distinct * WHERE { { ?s rdfs:label 'Alzheimer\'s disease'@en} } limit 250 Result from live.dbpedia.org [1]: s Result from dbpedia.org [2]: s Example: bash-4.1$ curl -v \" << Location: http://dbpedia.org/page/Alzheimer's_disease < bash-4.1$ curl -v \"http://dbpedia.org/page/Alzheimer's_disease\" < < HTTP/1.1 404 Not found < bash-4.1$ curl -v \"http://dbpedia.org/page/Alzheimer%27s_disease\" < < HTTP/1.1 404 Not found < Changing Server to \"live.dbpedia.org\" curl -v \"http://live.dbpedia.org/page/Alzheimer's_disease\" < < HTTP/1.1 200 OK < curl -v \"http://live.dbpedia.org/page/Alzheimer%27s_disease\" < < HTTP/1.1 200 OK < [1]http://live.dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query;=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0ASELECT+distinct+*++WHERE+{%0D%0A{+%3Fs+rdfs%3Alabel+%27Alzheimer\%27s+disease%27%40en}%0D%0A}+limit+250%0D%0A&should-sponge;=&format;=text%2Fhtml&timeout;=0&debug;=on [2]http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query;=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0ASELECT+distinct+*++WHERE+{%0D%0A{+%3Fs+rdfs%3Alabel+%27Alzheimer\%27s+disease%27%40en}%0D%0A}+limit+250%0D%0A&format;=text%2Fhtml&timeout;=30000&debug;=on Cheers Johannes uOn 29 July 2013 07:56, Magnus Knuth < > wrote: What do you mean by \"contemporary dumps\"? Can you post a link to an example file that contains inconsistent IRIs/URIs? This may help: uIn 3.8 release there were some encoding changes that were deployed in Live before they were announced and resulted in these errors. The core Live algorithm has been rewritten and is currently deployed in Dutch (live.nl.dbpedia.org). We plan to port it in English the following weeks. It will be a complete new deployment and thus we will create a new dump to remove the revision inconsistencies. Best, Dimitris On Tue, Jul 30, 2013 at 2:36 AM, Jona Christopher Sahnwaldt < uAm 30.07.2013 um 01:36 schrieb Jona Christopher Sahnwaldt: E.g. moreover some URIs contain Unicode-Encoding, which resembles yet another version of the URI, same folder 000238.added.nt.gz contains: . . According to while for Wikipedia any combination works: Anyway, since we are using IRIs, i.e. Thanks, I will check that. uHi Magnus, I think Dimitris already answered your main questions. Just wanted to add a few details with my language-lawyer hat on. On 30 July 2013 13:43, Magnus Knuth < > wrote: I see. I thought you meant there were such errors in the dumps at That's right. By the way, I'd love to move to IRIs for DBpedia English as well. Others think we should stick with URIs for backwards compatibility. That's because Wikipedia URIs are used as URLs - they are used in HTTP requests, and thus (roughly speaking) most percent-encodings don't matter. DBpedia URIs on the other hand are RDF \"URI references\" and thus define a resource based on string equality - every little difference matters. That's one of the problems of using URIs as identifiers. JC" "Fetching types for a large number of concepts" "uHi folks, What's the best way to retrieve types (rdf:type) of a large amount of concepts, in association with the concept itself? For instance: [Barack_Obama:Person, United_States:Place,] I came up with this query: SELECT ?c ?t { { a ?t. foaf:isPrimaryTopicOf ?c} UNION { a ?t. foaf:isPrimaryTopicOf ?c} UNION . . . <== repeat for all concepts } Which outputs the desired result but: 1. it's slow (takes ~2.3s for 200 concepts) 2. 
it doesn't go beyond 200-something concepts because apparently there's a limit on the number of nested subqueries in a query. Is there a better way to accomplish this? Regards, Parsa uHi Parsa, On 12/13/2012 01:20 PM, Parsa Ghaffari wrote: Try to use the \"IN\" operator, that might help, as follows: SELECT ?article ?type where { {?resource a ?type. ?resource foaf:isPrimaryTopicOf ?article .filter(?resource IN (dbpedia:Barack_Obama, dbpedia:United_States, )} } uYou can use the dump (nt/ttl) file and find everything in there Best, Dimitris On Thu, Dec 13, 2012 at 2:20 PM, Parsa Ghaffari < >wrote: uMohamed, That's gorgeous. After I sent the first email, I came up with: SELECT ?c ?t {?c a ?t. FILTER(?c = dbpedia:Barack_Obama || ?c = dbpedia:United_States || )} But this seems even faster (is that known as a fact?). Thanks a lot. Dimitris, Thanks but I need to serve the data through Virtuoso. On Thu, Dec 13, 2012 at 1:09 PM, Mohamed Morsey < > wrote: uOne thing though: the shorthand prefixed URI format (e.g. dbpedia:United_States) raises a syntax error when there's a % or () in the query (e.g. dbpedia:Archive_(band)) is there a way to get around that? It obviously shortens the query significantly. Thanks. On Thu, Dec 13, 2012 at 1:56 PM, Parsa Ghaffari < > wrote: uHi Parsa, On 12/13/2012 03:19 PM, Parsa Ghaffari wrote: Please have a look on that thread [1]. [1] viewtopic.php?f=12&t;=1350" "SAXParseException when testing DBPedia mappings" "uHi, I am getting a parsing exception when testing this DBPedia mapping: Exception: org.xml.sax.SAXParseException; lineNumber: 482; columnNumber: 80; The element type \"api\" must be terminated by the matching end-tag \" \". Could you suggest how to fix it? I tried removing most of the mappings, leaving only foaf:name, but was still getting the same exception. Thanks, Uldis Hi, I am getting a parsing exception when testing this DBPedia mapping: 80; The element type 'api' must be terminated by the matching end-tag ''. Could you suggest how to fix it? I tried removing most of the mappings, leaving only foaf:name, but was still getting the same exception. Thanks, Uldis" "Last Mile: 7th Int. Conf. on Developments in eSystems Engineering (DeSE 2014)" "uLast Mile Seventh International Conference on Developments in eSystems Engineering (DeSE '2014) 25th - 27th August 2014 Azia Hotel and Spa 5*, Paphos, Cyprus www.dese.org.uk Proceedings will be published by IEEE Submission Deadline: 15th June 2014 Recent years have witnessed increasing interest and development in computerised systems and procedures, which exploit the electronic media in order to offer effective and sophisticated solutions to a wide range of real-world applications. Innovation and research development in this rapidly evolving area of eSystems has typically been reported on as part of cognate fields such as, for example: Information and Communications Technology, Computer Science, Systems Science, Social Science and engineering. This conference, on the developments in eSystems Engineering will act as a platform for disseminating and discussing new research findings and ideas in this emerging field. Papers are invited on all aspects of eSystem Engineering, modelling and applications, to be presented at a three day conference in Paphos, Cyprus. Authors will have their submissions reviewed by international experts and accepted papers will appear in the conference proceedings published by the Conference Publication Services (CPS) for worldwide presence. 
The event provides authors with an outstanding opportunity for networking and presenting their work at an international conference. The location offers an especially attractive opportunity for professional discussion, socialising and sightseeing. DeSE 2014 conference is technically co-sponsored by IEEE. DeSE 2014 comprises an exciting spectrum of highly stimulating tracks: o eLearning (Technology-Enhanced Learning) o eGovernment systems, Autonomic Computing and AI o eBusiness and Management o eHealth and e-Medicine o eScience and Technology o eSecurity and e-Forensics o eEntertainment and Creative Technologies o eNetworking and Wireless Environments o eUbiquitous Computing and Intelligent Living o Green and Sustainable Technologies o eCulture and Digital Society o eSport Science o eSystems Engineering (Main Stream) o Sustainable Construction and Renewable Energy All submitted papers will be peer-reviewed. Accepted and presented papers will be published in the final technical conference proceedings, and will be indexed in IEEE Xplore and EI Compendex, subject to final approval by the Conference Technical Committee. Venue Azia Resort and Spa 5*, Paphos, Cyprus Built as three different sections on adjacent grounds, with plethora of magnificent spaces for carefree living, the Azia Resort has all the diversity to keep its guests contented for a week or longer. Each element of the three-in-one boutique Cyprus hotel concept has its own character and fulfils different aspirations of the visitor. The Azia Blue is about sophisticated, spacious living and family luxury hotel. The Azia Club and Spa is ideal for privacy and indulgence. One of the few resorts or Cyprus hotels with a west-facing outlook, the Azia hotel provides guests with a sublime view of blazing indescribable sunsets. Tentative Dates Submission Deadline: June 15th, 2014 Notification of Acceptance: July 15th, 2014 Camera-Ready Submission: July 31st, 2014 uLast Mile Seventh International Conference on Developments in eSystems Engineering (DeSE '2014) 25th - 27th August 2014 Azia Hotel and Spa 5*, Paphos, Cyprus www.dese.org.uk Proceedings will be published by IEEE Submission Deadline: 30th June 2014 Recent years have witnessed increasing interest and development in computerised systems and procedures, which exploit the electronic media in order to offer effective and sophisticated solutions to a wide range of real-world applications. Innovation and research development in this rapidly evolving area of eSystems has typically been reported on as part of cognate fields such as, for example: Information and Communications Technology, Computer Science, Systems Science, Social Science and engineering. This conference, on the developments in eSystems Engineering will act as a platform for disseminating and discussing new research findings and ideas in this emerging field. Papers are invited on all aspects of eSystem Engineering, modelling and applications, to be presented at a three day conference in Paphos, Cyprus. Authors will have their submissions reviewed by international experts and accepted papers will appear in the conference proceedings published by the Conference Publication Services (CPS) for worldwide presence. The event provides authors with an outstanding opportunity for networking and presenting their work at an international conference. The location offers an especially attractive opportunity for professional discussion, socialising and sightseeing. DeSE 2014 conference is technically co-sponsored by IEEE. 
DeSE 2014 comprises an exciting spectrum of highly stimulating tracks: o eLearning (Technology-Enhanced Learning) o eGovernment systems, Autonomic Computing and AI o eBusiness and Management o eHealth and e-Medicine o eScience and Technology o eSecurity and e-Forensics o eEntertainment and Creative Technologies o eNetworking and Wireless Environments o eUbiquitous Computing and Intelligent Living o Green and Sustainable Technologies o eCulture and Digital Society o eSport Science o eSystems Engineering (Main Stream) o Sustainable Construction and Renewable Energy All submitted papers will be peer-reviewed. Accepted and presented papers will be published in the final technical conference proceedings, and will be indexed in IEEE Xplore and EI Compendex, subject to final approval by the Conference Technical Committee. Venue Azia Resort and Spa 5*, Paphos, Cyprus Built as three different sections on adjacent grounds, with plethora of magnificent spaces for carefree living, the Azia Resort has all the diversity to keep its guests contented for a week or longer. Each element of the three-in-one boutique Cyprus hotel concept has its own character and fulfils different aspirations of the visitor. The Azia Blue is about sophisticated, spacious living and family luxury hotel. The Azia Club and Spa is ideal for privacy and indulgence. One of the few resorts or Cyprus hotels with a west-facing outlook, the Azia hotel provides guests with a sublime view of blazing indescribable sunsets. Tentative Dates Submission Deadline: June 30th, 2014 Notification of Acceptance: July 15th, 2014 Camera-Ready Submission: July 31st, 2014" "Tim O'Reilly on dbpedia" "uLast time it was just in a quote, this time Tim O'Reilly actually looked at it: The whole article is quite good, it contrasts the approaches of Semantic MediaWiki, Freebase, and dbpedia, and takes a general look at why the Semantic Web isn't here yet (because people won't do extra work without benefit). Richard" "Using DBPedia for Country Data" "uHi all, I'm new to DBPedia and am working on a project that needs statistics about countries. DBPedia seems to be the perfect resource for this data as it's all there - I just seem to be having trouble getting at it! This query worked the other day but out of the blue has stopped working and I don't know why! The query gets all of the properties for a specific country all the associated values. Some of these properties are URIs and instead of returning these I just wanted a string of text associated with that URI. With this in mind the query is split into two parts, the first which finds all of the URIs and finds the label associated with it. The second part gets all of the properties that aren't URIs. I then union the two to retrieve everything I want. SELECT ?title ?value WHERE { { ?country ?prop ?uri. ?country a . ?country rdfs:label ?label. ?prop rdf:type rdf:Property. ?prop rdfs:label ?title. ?uri rdfs:label ?value. FILTER (lang(?value) = \"en\") } UNION { ?country ?prop ?value. ?country a . ?country rdfs:label ?label. ?prop rdf:type rdf:Property. ?prop rdfs:label ?title. } FILTER (?label = \"United States\"@en && !isURI(?value)) } Individually these work but the union, which worked a couple of days ago, appears to break it. For those who are interested the working prototype can be seen at If you have any ideas what I'm doing wrong any help would be appreciated! Many thanks, Sam uOn 6/12/13 11:24 AM, Sam Esgate wrote: We'll look into what happening here. 
In the meantime, if your application is using DBpedia URIs, why can't I actually access those URIs? For instance, have you looked at some of the guidelines for attributing use of DBpedia? You simply need to do one of the following: 1. anchor labels with DBpedia URIs using 2. use via @rel in to associate your HTML pages with DBpedia URIs that are data sources for the HTML page 3. ditto using "Link: " response headers. In your case, I think you are close to #1 if you could replace the local identifiers with DBpedia URIs. Kingsley uCheers for your reply! Aaah, I see that the current public version of the program doesn't pull the data to the right place on the display (so you can't see it at the minute) but that should be fixed shortly. The issue is that I don't particularly want the URI itself, just the text. For now we've got a workaround which does the two queries and combines them our end which does the job so don't worry too much - it'd just be interesting to know why it doesn't work! Many thanks, Sam From: Kingsley Idehen Sent: Wednesday, 12 June 2013 16:50 To: On 6/12/13 11:24 AM, Sam Esgate wrote: We'll look into what's happening here. In the meantime, if your application is using DBpedia URIs, why can't I actually access those URIs? For instance, have you looked at some of the guidelines for attributing use of DBpedia? You simply need to do one of the following: 1. anchor labels with DBpedia URIs using 2. use via @rel in to associate your HTML pages with DBpedia URIs that are data sources for the HTML page 3. ditto using "Link: " response headers. In your case, I think you are close to #1 if you could replace the local identifiers with DBpedia URIs. Kingsley uAaaaah - that makes sense! I'll give that a go tomorrow - many thanks! Cheers, Sam On 13/06/13 12:10, Kingsley Idehen wrote:" "Problems connecting with Dbpedia Lookup web service." "uHi! I'm trying to use the Lookup web service from DBpedia, but I'm finding difficulties because there's very little information on the web.
I found a code example, but when I execute it, *I get the errors: *java.lang.NullPointerException at tutorial.DBpediaLookupClient.containsSearchTerms(DBpediaLookupClient.java:111) at tutorial.DBpediaLookupClient.endElement(DBpediaLookupClient.java:87) at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) at javax.xml.parsers.SAXParser.parse(SAXParser.java:195) at tutorial.DBpediaLookupClient.(DBpediaLookupClient.java:56) at tutorial.DBpediaLookupClient.main(DBpediaLookupClient.java:126) *HERE IS THE CODE: * package tutorial; import java.io.IOException; import java.io.InputStream; import java.net.URLEncoder; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; import java.util.StringTokenizer; import javax.xml.parsers.SAXParser; import javax.xml.parsers.ParserConfigurationException; import javax.xml.parsers.SAXParserFactory; import org.apache.commons.httpclient.HttpClient; import org.apache.commons.httpclient.HttpException; import org.apache.commons.httpclient.HttpMethod; import org.apache.commons.httpclient.methods.GetMethod; import org.xml.sax.SAXException; import org.xml.sax.XMLReader; import org.xml.sax.helpers.DefaultHandler; import com.sun.org.apache.bcel.internal.classfile.Attribute; public class DBpediaLookupClient extends DefaultHandler { / * @param args */ private List > variableBindings = new ArrayList >(); private Map tempBinding = null; private String lastElementName = null; private String query = \"\"; public DBpediaLookupClient( String query ) throws Exception { this.query = query; HttpClient client = new HttpClient(); HttpMethod method = new GetMethod(\" try{ client.executeMethod(method); System.out.println(method); InputStream ins = method.getResponseBodyAsStream(); SAXParserFactory factory = SAXParserFactory.newInstance(); javax.xml.parsers.SAXParser sax = factory.newSAXParser(); sax.parse(ins, this); }catch (HttpException he) { System.err.println(\"Http error connecting to lookup.dbpedia.org\"); }catch (IOException ioe) { System.err.println(\"Unable to connect to lookup.dbpedia.org\"); }catch (ParserConfigurationException e) { System.out.println(\"O parser não foi configurado corretamente.\"); e.printStackTrace(); }catch (SAXException e) { System.out.println(\"Problema ao fazer o parse do arquivo.\"); e.printStackTrace(); } method.releaseConnection(); } public void startElement(String uri, String localName, String qName, Attribute attributes) throws SAXException { if (qName.equalsIgnoreCase(\"result\")) { tempBinding = new HashMap (); } lastElementName = qName; } public void endElement(String uri, String localName, String qName) throws SAXException { if (qName.equalsIgnoreCase(\"result\")) { System.out.println(\"Qname:\" + qName); if (!variableBindings.contains(tempBinding) && containsSearchTerms(tempBinding)) 
variableBindings.add(tempBinding); } } public void characters(char[] ch, int start, int length) throws SAXException { String s = new String(ch, start, length).trim(); if (s.length() > 0) { if ("Description".equals(lastElementName)) tempBinding.put("Description", s); if ("URI".equals(lastElementName)) tempBinding.put("URI", s); if ("Label".equals(lastElementName)) tempBinding.put("Label", s); } } public List > variableBindings() { return variableBindings; } private boolean containsSearchTerms(Map bindings) { StringBuilder sb = new StringBuilder(); for (String value : bindings.values()){ sb.append(value); // do not need white space } String text = sb.toString().toLowerCase(); StringTokenizer st = new StringTokenizer(this.query); while (st.hasMoreTokens()) { if (text.indexOf(st.nextToken().toLowerCase()) == -1) { return false; } } return true; } public static void main(String[] args) { try { DBpediaLookupClient client = new DBpediaLookupClient("berlin"); } catch (Exception e) { e.printStackTrace(); } } } *If anybody has any idea of what's wrong please help me. Or if any of you has any other code example it will help a lot too.* *Thank you!*
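A note on the posted code, offered as a sketch rather than a confirmed diagnosis: line 111 of the trace falls inside containsSearchTerms, where bindings.values() is called, and tempBinding can only be null there because the posted startElement takes com.sun.org.apache.bcel.internal.classfile.Attribute instead of the SAX type org.xml.sax.Attributes. With that signature the method never overrides DefaultHandler.startElement, SAX never calls it, and tempBinding is never initialised before endElement runs. A possible correction of just that method (the field is assumed to be a Map<String, String>, since the generics were stripped in the post):

    // Drop: import com.sun.org.apache.bcel.internal.classfile.Attribute;
    // Add:
    import org.xml.sax.Attributes;

    // Replacement for startElement in DBpediaLookupClient (java.util.HashMap is
    // already imported in the posted class). The @Override annotation makes the
    // compiler reject any signature that does not really override
    // DefaultHandler.startElement.
    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) throws SAXException {
        if (qName.equalsIgnoreCase("result")) {
            tempBinding = new HashMap<String, String>();
        }
        lastElementName = qName;
    }

A defensive null check in endElement (skip the result if tempBinding is still null) would additionally turn the NullPointerException into a silent miss while testing.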
uWhich line is line 111 in DBpediaLookupClient.java? On 11 July 2013 21:11, Luciane Monteiro < > wrote:" "SKOS:Concept" "uI am a little confused by something. I understand the basics of the SKOS hierarchy but am confused when I find categories that have no "parent" category (for lack of a better word). For example, why doesn't Personal_Finance have a SKOS: Broader? Thanks, Ander Murane Associatedcontent.com" "Test email to list" "uThis is just a test, as my last email to the dbpedia-discussion list has not shown up in my inbox nor in the list archives at Apologies for the noise. - Sam uOn 11/05/2013, Sam Kuper < > wrote: OK, this one generated a receipt acknowledgement from the mailing list server; and I see that provides a plausible explanation for the stale state of . So I have some hope that my first message, "Inconsistency in DBpedia?" probably did get through to the list. If anyone is able to confirm, I would be grateful. Thanks for your understanding, and apologies again for the noise. - Sam uHello Sam, Yes the message "Inconsistency in DBpedia?" appears on the list, on May 11th at 4:22 AM local time (GMT+2).
Regards, Christoph Am 11.05.2013 14:48, schrieb Sam Kuper: uDear Christoph, On 11/05/2013, Christoph Lauer < > wrote: Many thanks for confirming this, Sam" "After Update Queries Stopped Working" "uHi, Ever since the update of DBpedia yesterday the majority of our queries stopped working. We've noticed that queries containing snippets such as "?artist foaf:name "U2"." don't yield any results. Is this an intended behavior? If so, what would be a decent workaround? Thanks, Fabian uOn 13 Apr 2010, at 22:02, Fabian Howahl wrote: Seems to work after adding the language tag: SELECT * WHERE { ?artist foaf:name "U2"@en . } Actually I'm surprised it worked in 3.4 without the language tag ;-) Best, Richard" "Where is there a SCHEMA DIAGRAM and description of the database that I can access via SQL?" "uWhere is there a SCHEMA DIAGRAM and description of the database that I can access via SQL? Alan From: Kingsley Idehen < > To: ; Sent: Friday, November 4, 2016 12:39 PM Subject: Re: [DBpedia-discussion] Introducing the Ontology2 Edition of DBpedia 2016-04 On 11/4/16 1:38 PM, wrote: I am saying: A personal or service-specific instance of DBpedia has less traffic and less volatile query mix. One is serving the world, the other a specific client application / service. So, you say Paul's offering has nothing to do with a dbpedia-SPARQL-endpoint serving the world, its aim is serving 'a specific client application'. Although, it seems to be 'NICE' I register this as your opinion to my original posting to Paul Thanks, baran Paul has a Virtuoso instance configured and deployed via an Amazon AMI in the AWS Cloud. Like the DBpedia SPARQL endpoint, when the Virtuoso RDBMS is up and running you can query data via SPARQL and/or SQL. You have a Virtuoso instance in an AMI vs a public Virtuoso instance that both provide access to the same DBpedia dataset. Naturally, you could open up access to the public using the AMI variant too (and deal with the bill+++). uAlso if you are interested in the schema it is easy to query it in SPARQL. These queries generally don't work the triple server hard so they work just fine against the public SPARQL endpoint: For instance, this query will give you the types ordered by how many occurrences there are of the types: select ?p ?o { ?p ?o . } gives you all the statements where the type appears on the left and select ?s ?p { ?s ?p . FILTER(?p!=rdf:type) } gives you all the statements where the type appears on the right, omitting statements of the form "?x a ?type." Diagramming the schema looks like a fun project, but the raw material to analyze the schema is all there. Note most of the predicates used for schema purposes are defined in uI know. But really, SPARQL is not too different from SQL, it's based on the same math. SPARQL query result sets are almost indistinguishable from SQL result sets so you can take your whole bag of tricks from SQL to SPARQL. Any SQL <-> SPARQL translation runs into issues with cardinalities, that is, SQL aligns a single set of values for various properties across a row, so you have something like [ :personId 8831 ; :dateOfBirth "1972-04-24"^^xsd:date ; :firstName "Sachin" ; :lastName "Tendulkar" ; ] and for many properties the 1-1 association makes sense, but RDF standards do not enforce it. There might be multiple claims about any of these things, or maybe somebody changed their :lastName from "Lieber" to "Lee", or their :gender from :Female to :Male, whatever.
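To make the cardinality point concrete before the message continues below with genuinely multi-valued properties, here is a minimal, self-contained sketch (hypothetical example.org data, Apache Jena assumed) in which one subject carries two :lastName claims; the SELECT that would fill a single SQL-style row then returns two rows, one per claim:

    import org.apache.jena.query.*;
    import org.apache.jena.rdf.model.*;

    public class CardinalityDemo {
        public static void main(String[] args) {
            // Tiny in-memory graph: one person, two lastName claims.
            String ns = "http://example.org/";            // hypothetical namespace
            Model m = ModelFactory.createDefaultModel();
            Resource person = m.createResource(ns + "person/8831");
            Property lastName = m.createProperty(ns + "lastName");
            person.addProperty(lastName, "Lieber");
            person.addProperty(lastName, "Lee");

            // Two result rows for the same subject: the cardinality issue a
            // SPARQL-to-SQL mapping has to decide how to handle.
            String q = "SELECT ?person ?lastName WHERE { ?person <" + ns + "lastName> ?lastName }";
            try (QueryExecution qe = QueryExecutionFactory.create(q, m)) {
                ResultSetFormatter.out(System.out, qe.execSelect());
            }
        }
    }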
Then there are properties that are definitely multi-valued such as :wroteBook, :hadParent, etc. All of that can be modeled in SQL but the process of going from SPARQL to SQL could be lossy if you don't take the time to deal with all the problems that present themselves. (Or unless you do some trivial mapping such as as creating s,p,o columns) With SPARQL you can get the original data with complete fidelity without thinking about how the conversion works out. People who are comfortable with SQL might find that learning SPARQL is a shorter path to getting the results they want. (For one thing, you can explore with the public endpoint) u0€ *†H†÷  €0€10  `†He" "Incorrect concept's type" "uHi, There is a resource in DBpedia: Monty Python ( This is a \"comedy group\", but DBpedia organized it as the following types: http://umbel.org/umbel/rc/artist http://dbpedia.org/class/yago/britishcomedytroupes http://umbel.org/umbel/rc/comedian Following the above output, DBpedia says that: it's a PERSON (incorrect). It must be a \"comedy group\"! Why? Hi, There is a resource in DBpedia: Monty Python (http://live.dbpedia.org/page/Monty_Python) This is a \" comedy group \", but DBpedia organized it as the following types: http://www.w3.org/2002/07/owl#thing http://xmlns.com/foaf/0.1/person http://dbpedia.org/ontology/person http://dbpedia.org/ontology/agent http://schema.org/person http://dbpedia.org/ontology/artist http://dbpedia.org/ontology/comedian http://umbel.org/umbel/rc/artist http://dbpedia.org/class/yago/britishcomedytroupes http://umbel.org/umbel/rc/comedian Following the above output, DBpedia says that: it's a PERSON (incorrect). It must be a \"comedy group\"! Why? uDear Aleksander, I want to classify concepts to PERSON/LOCATION/ORGANIZATION/ETC using DBpedia. It seems that conflicts are troublesome. Would you please give me more information about how can I use this mapping for better classification in the above IE task? Describe the solution using \"Monty Python\" case. Thank you. Kind regards, Amir uDear Amir,   the task you wish to accomplish is a sense easier to do with the mapping and in some sense harder. The easier part is that the classification information is more accurate.     E.g. for Monty Python my mapping contains the following statement: <   The obscure Cyc symbol is a ComedyTeam (alas, the Semantic Web endpoint of Cyc is not working at the moment :( I checked it in my copy of OpenCyc). So this is precisely the type of Monty Python. In Cyc this generalizes to MultiPersonAgent - something similar to organization.   So the easier part is that you have the precise data. But the hard part is that you have to learn Cyc (at least some basics).    It is up to you to decide if you want to invest your time in learning Cyc or you are going to develop your own method of resolving the conflicts.   Cheers, Aleksander Dnia 13 lutego 2013 20:29 Amir Hossein Jadidinejad < > napisał(a): uHi Amir, Aleksander this could be fixed in the DBpedia mappings. The Monty Python wikipedia articles is using the Infobox_comedian [1] which is defined also for Comedy groups. The problem is that the mapping [2] maps only to the Comedian class, while it should use a ConditionalMapping, as done for the Infobox_musical_artist [3] In order to fix this I think we should: 1) Create a ComedyGroup class, which is a rdfs:subClassOf Organisation 2) Add a ConditionalMapping in [1] that maps to ComedyGroup in case current_members or past_members is set I am going to do this is a while. 
Cheers [1] [2] [3] 2013/2/14 Aleksander Pohl < > uDone. Now Monty Python is a ComedyGroup: BR Andrea 2013/2/14 Andrea Di Menna < >" "Literals written as URIs in the .nt files" "uHi, I'm noticing some strange output from the latest build of the Extraction Framework. Namely in the wikipedia_links and geo_coordinates output files. In the wikipedia_links I get . instead of \"de\"@de . and in geo_coordinates i get <37.98111111111111 23.733055555555556> instead of \"37.98111111111111 23.733055555555556\"@de . I tried it again on a clean server with a clean build to be sure it's not due to my meddling with the framework, but I've got the same results. Can anyone else confirm this ? Any Idea what could be causing it ? Kind Regards, Alexandru Todor uThanks for reporting this bug. It is now fixed. Cheers, Max On Mon, Jul 18, 2011 at 17:28, Alexandru Todor < > wrote:" "Very simple DBpedia query with puzzling result" "uDoes anyone have an idea why the simple query: select distinct ?x where { ?x rdfs:domain owl:Thing } returns only 11 items, as in: instead of items in: ? Vladimir Cvjetkovic Sent from Windows Mail Does anyone have an idea why the simple query: select distinct ?x where { ?x rdfs:domain owl:Thing } returns only 11 items, as in: u0€ *†H†÷  €0€1 0 +" "Bug with double infoboxes" "uHi, I don't know if it's a bug or not but when a wikipedia page like this one : Is described by two infoboxes, only the first one is analysed by the MappingExtractor, but the InfoboxExtractor analyse both. Is-it normal or not ? Best. Julien. Hi, I don't know if it's a bug or not but when a wikipedia page like this one : Julien." "dbpedia voiD bug?" "uI recently saw the voiD data for dbpedia at [1] and noticed what I believe is a bug in the voiD/scovo statistics about the dataset. The two statItems: both seem to describe the total number of triples (void:numOfTriples) of the dataset, instead of the latter describing the number of distinct subjects (void:numberOfDistinctSubjects) as it seems it is meant to. Also, I believe void:numOfTriples should be void:numberOfTriples as described in the voiD Guide[2] (although neither of these seem to be in the actual RDF for the voiD vocabulary). thanks, .greg [1] [2] guide uHi Greg, On 19 Aug 2010, at 18:22, Gregory Williams wrote: The statistics part of voiD is in flux. The DBpedia voiD file is based on a draft of the upcoming version 2 of the voiD Guide. I have committed to doing a thorough review of the DBpedia voiD file as soon as this part of the spec settles, and as part of this we will straighten out any issues like this. Best, Richard" "Linking content from DBpedia to my content (was: Please Urgent Question Need answer)" "uHi Abdullah, Excuse me for breaking into your discussion, but from what I understand, you don't need to extract anything from Wikipedia yourself. The DBpedia (Live) project have the abstracts of all Wikipedia articles; if you have a list of article titles you can just retrieve them from DBpedia. For example, if the topic/article on your list is \"Frisbee\", you use HTTP GET on , which returns RDF data about the concept . The abstract you're looking for is the object in the triple(s) conforming to the pattern dbpedia-owl:abstract [object] . (Frisbee redirects to Flying disc.) If your list is very very long, you'd better download a dump of all abstracts, load them in a triple store and use SPARQL to extract the abstracts you need, or perform your queries directly on the file using a SPARQL engine. I hope this answers your question. 
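As an illustration of the retrieval Ben describes above (a minimal sketch, assuming Apache Jena 3.x, outbound HTTP access, and that the usual DBpedia abstract property IRI http://dbpedia.org/ontology/abstract is the one wanted), dereferencing the resource and keeping the English abstract looks roughly like this:

    import org.apache.jena.rdf.model.*;

    public class AbstractFetcher {
        public static void main(String[] args) {
            // Dereference the resource; DBpedia returns RDF via content negotiation.
            String uri = "http://dbpedia.org/resource/Flying_disc";
            Model m = ModelFactory.createDefaultModel();
            m.read(uri);

            Property abstractProp = m.createProperty("http://dbpedia.org/ontology/abstract");
            StmtIterator it = m.getResource(uri).listProperties(abstractProp);
            while (it.hasNext()) {
                Literal value = it.next().getLiteral();
                if ("en".equals(value.getLanguage())) {   // keep the English abstract only
                    System.out.println(value.getString());
                }
            }
        }
    }

For a long list of titles, the same pattern applies, but loading the abstracts dump into a local store and querying it with SPARQL, as suggested above, avoids one HTTP round trip per article.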
If the concepts on your list do not match any DBpedia resources directly, you will need to perform text searches on the abstracts (you can, using SPARQL, to some degree, or perhaps the Lookup service) or perhaps even start to think of Topic Modelling, but IMHO other platforms are better suited to questions about these topics. StackOverflow, for example, has lots of questions about using SPARQL with DBpedia. Good luck! Regards, Ben On 6 August 2013 19:19, Abdullah Nasser < > wrote:" "Infovore 1.0 released" "uI'm proud to announce the 1.0 release of Infovore, a complete RDF processing system * a Map/Reduce framework for processing RDF and related data * an application that converts a Freebase quad dump into standard-compliant RDF * an application which creates consistent subsets of Freebase, including compound value types, about subsets of topics that can be selected with SPARQL-based rules * a query-rewriting system that enforces a u.n.a island while allowing the use of multiple memorable and foreign keyed names in queries as does the MQL query engine * a whole-system test suite that confirms correct operation of the above with any triple store supporting the SPARQL protocol See the documentation here to get started. The chief advantage of Infovore is that it uses memory-efficient streaming processing, making it possible, even easy, to handle billion-triple data sets on a computer with as little as 4GB of memory. Future work will likely focus on the validation and processing of the official Freebase RDF dump as well as other large RDF data sets." "Obvious wrong entity types" "uHi all, I've noticed some amount of incorrectly classified entities following a similar pattern. Try out the following query (before exec. set the exec. timeout to e.g. 80000): SELECT ?s (COUNT(?s) AS ?count ) { ?s ?p ?o . ?s a . FILTER(isIRI(?o)) } GROUP BY ?s ORDER BY DESC(?count) Results: 651 642 638 601 573 564 http://dbpedia.org/resource/2010_Clube_de_Regatas_do_Flamengo_season 447 http://dbpedia.org/resource/2010_Santos_FC_season 413 http://dbpedia.org/resource/List_of_Mexican_football_transfers_summer_2012 377 http://dbpedia.org/resource/List_of_Mexican_football_transfers_summer_2014 373 http://dbpedia.org/resource/G._K._Chesterton 371 Note the List_xxx DBpedia resources classified as person. Maybe there is some rationale behind classifying "lists of persons" as "person" but IMO this is misleading information for end-users looking for persons. A more appropriate type might be "List of people" or "List of footballers".
Also, I wonder what is the provenance of this class? I believe its not from a infobox info but more from some machine learning algorithm. Is it possible to track the provenance? Moreover, it could be possible to filter-out such resources using the Wikipedia categories info, however, this info is not available for the lists. In Wikipedia there are 3 categories assigned for https://en.wikipedia.org/wiki/List_of_Iranian_football_transfers_summer_2012 but non of them in the corresponding DBpedia resource http://dbpedia.org/page/List_of_Iranian_football_transfers_summer_2012 Any thoughts on this? Thanks, Milan" "flickr wrapper and image copyright ownership" "uHello List, first: i think the flickr wrapper is a great idea, so thank you for coding this. the example[1] does not link back to the original flickr photo page and does not include the copyright information. while the flickr api tos[2] doesn't talk about that explicitly it's good habit[3] for flickr api users and mentioned in the community guidlines[4]: \"Do link back to Flickr when you post your photos elsewhere\". i can imagine some flickr users (including me) could be concerned about usage of their images in another context without telling about copyright ownership. imho[5] you have to tell people about the source if you use it (for free). Nils K. Windisch [1] [2] [3] [4] [5] search?hl=enq=define%3A+imho uHi Nils, Sorry for the late reply. Thanks for the suggestion - it makes a lot of sense and I somehow missed it. I went ahead and implemented it - the photo page on flickr is now referenced by foaf:page links (credits go to Richard for proposing this), and it is linked to the images in the HTML display. Cheers, Christian" "a query with count returns different results from the same query without count" "uHi I have just discovered that a query with count have a behaviour that is different from the same query without count. Now, I might be missing something, but here is what happens. My understanding is that COUNT only counts the statements that would otherwise be returned using the same SPARQL query without COUNT. Query: PREFIX onto: PREFIX rdf: PREFIX rdfs: SELECT DISTINCT ?uri WHERE { ?cave rdf:type onto:Cave . ?cave onto:location ?uri . OPTIONAL {?uri rdfs:label ?string. FILTER (lang(?string) = 'en') } } HAVING (count(?cave) > 2) returns 18 results now the same query but with count PREFIX onto: PREFIX rdf: PREFIX rdfs: SELECT count(DISTINCT ?uri) WHERE { ?cave rdf:type onto:Cave . ?cave onto:location ?uri . OPTIONAL {?uri rdfs:label ?string. FILTER (lang(?string) = 'en') } } HAVING (count(?cave) > 2) returns 156 it seems that it ignores the part with HAVING(count(?cave)>2) as PREFIX onto: PREFIX rdf: PREFIX rdfs: SELECT count(DISTINCT ?uri) WHERE { ?cave rdf:type onto:Cave . ?cave onto:location ?uri . OPTIONAL {?uri rdfs:label ?string. FILTER (lang(?string) = 'en') } } returns 156 as well" "Dereferencing URIs using Jena." "uHi! I need to dereference *URIs*, like: and returned from *Spotlight*, to get their *RDFS:comments* property. Is it something like this: Model model = ModelFactory.createDefaultModel(); model.read( \" But how do I get the property? Thank you! Hi! I need to dereference URIs , like: you! uOn Fri, Nov 8, 2013 at 7:50 AM, Luciane Monteiro < > wrote: I doubt a Jena question really belongs on the DBpedia mailing list. This question doesn't essentially depend on DBpedia at all; it's just about retrieving some data using Jena and then querying a Jena model. 
It would be more appropriate to ask on the Jena users mailing list. At any rate, Once you have the model, and since you know that the resource is like this (untested): Resource google = model.createResource( \" StmtIterator stmts = google.listProperties( RDFS.comment ); while ( stmts.hasNext() ) { Statement stmt = stmts.next(); RDFNode comment = stmt.getObject(); /* do something with the comment */ } //JT uSorry for posting in the wrong place, I was so concerned about asking, that I didn't realize. Thank you very much for the help. 2013/11/8 Joshua TAYLOR < >" "Problem for extract data" "uHi, I have some problems to extract datas from wikipedia dump. I have this exception : Java.lang.exception : No dump found for Wiki : commons But I put the files \"commonswiki-20110427-pages-articles.xml\" and “commonswiki-20110427-pages-articles.xml.bz2” in the \"C:/wikipediaDump/commons\" and here is my config file : dumpDir=C:/wikipediaDump outputDir=C:/output updateDumps=false extractors=org.dbpedia.extraction.mappings.MappingExtractor extractors.fr=org.dbpedia.extraction.mappings.MappingExtrector languages=fr The datas that I want to extract are the datas of this infobox : What have I done wrong ? Thanks in advance. Julien Plu. uHello, I have still the same problem. Anyone can help me ? It’s very important. Thanks in advance. Julien. De : Julien PLU [mailto: ] Envoyé : samedi 30 avril 2011 15:08 À : ' ' Objet : Problem for extract data Hi, I have some problems to extract datas from wikipedia dump. I have this exception : Java.lang.exception : No dump found for Wiki : commons But I put the files \"commonswiki-20110427-pages-articles.xml\" and “commonswiki-20110427-pages-articles.xml.bz2” in the \"C:/wikipediaDump/commons\" and here is my config file : dumpDir=C:/wikipediaDump outputDir=C:/output updateDumps=false extractors=org.dbpedia.extraction.mappings.MappingExtractor extractors.fr=org.dbpedia.extraction.mappings.MappingExtrector languages=fr The datas that I want to extract are the datas of this infobox : What have I done wrong ? Thanks in advance. Julien Plu. uHi, sorry about the late reply. On Sat, Apr 30, 2011 at 15:07, < > wrote: You need the date of the XML dump as a directory. Is the file \"commonswiki-20110427-pages-articles.xml\" in the directory C:\wikipediaDump\commons\20110427 ? Cheers, Max uHi, Now it works fine. Thank to you. Cheers. Julien." "HTTP response" "uHi, Have the rules of engagement changed for dbpedia? When I use curl to request RDF+XML using the Accept header, I get back a 301. Shouldn't I get a 304? :- C:\Program Files\curl-7.18.2>curl -I -H \"Accept: application/rdf+xml\" HTTP/1.1 301 Moved Permanently Date: Tue, 15 Feb 2011 19:53:48 GMT Server: Apache/2.2.16 Location: Connection: close Content-Type: text/html; charset=iso-8859-1 Richard uPlease ignore this: I failed to notice that I still had the \"www.\" in the URL when I finally got the curl syntax right A 301 is just what it should return for this URL. (And yes, it should have been a 303 I was after, not a 304.) Richard In message < >, Richard Light < > writes" "Free Linked Open Data Webinar on DBpedia Spotlight" "u(Apologies for cross-posting) What: LOD2 webinar series - DBpedia Spotlight Who : LOD2 project, presenter Pablo N. Mendes On 26.02.2013, 04.00pm CET the LOD2 project ( free one-hour webinar on \"DBpedia Spotlight\" (Open Knowledge Foundation, Germany, Pablo N. 
Mendes) This webinar in the course of the LOD2 webinar series will present DBpedia Spotlight which is a tool employed in the Extraction stage of the LOD Lyfe Cycle, performing Entity Recognition and Linking. The tool can be used to enable faceted browsing, semantic search, among other applications. In this webinar we will describe what is DBpedia Spotlight, how it works and how can you benefit from it in your application, and as a part of the LOD2 stack. If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the LOD2 webinar series! When : 26.02.2013, 04.00pm - 05.00pm CET Information & Registration: The LOD2 team is looking forward to meeting you at the webinar!! All the best and have a nice day! (Adapted from the announcement posted by by Thomas Thurner, Head of Transfer, Semantic Web Company GmbH)" "Kudos and basic statistics" "uTo All, DBpedia is most impressive and deserves more attention than it's gotten to date (esp. viz Freebase). I've pretty well scoured the Web site, but still have a couple of naive questions: 1. Is there a reference to how the information extraction from Wikipedia actually occurs? 2. What is the (estimated) unique count for properties across the database (saw 8 K for the Infobox portion)? Are DBpedia's unique properties directly comparable to YAGO's unique relations (14)? 3. Across DBpedia, what is the (estimated) unique count of instances? Again, how does that relate to terms such as concepts, instances or facts? 4. I'm intrigued with what possibility this team sees for YAGO (or other approaches as well, frankly) to improve DBpedia's classification structure (and, by implication, what might be perceived as weak with the current approach)? I know that all of this goes well beyond a numbers game, but I have a hard time seeing how expressing \"size\" using triples really means anything other than the raw content that needs to be stored and indexed. Aren't relations/properties + entities/concepts/instances better ways of expressing actual information coverage? Sorry to be a pest. :) I'm trying to get up to speed on some of this stuff so I can help explain it better to others. And, again, great work! Thanks, Mike uHello Mike, Michael K. Bergman schrieb: We wrote a paper about the infobox extraction: However, DBPedia is more than that and as far as I know there exists no general overview of all used techniques so far (not surprising as it is a fairly new project). I am not completely sure, but I think the number (8K) should be approximately correct since most properties come from infoboxes. (I just mean the overall count of different properties. Of course properties like skos:broader or geonames:featureCode appear very frequently.) No. YAGO extracts properties from the Wikipedia categories and Wikipedia redirects. Naturally, this allows to extract only a small amount of different properties. For instance, Wikipedia categories allow to extract \"subclass of\" or \"born in year\" (from categories like 1920_births) relations. YAGO does not perform infobox extraction, which is the reason why their property count is relatively small. You can find more information at: For infoboxes these statistics can be found in the first paper above. There has been some interesting discussion with the YAGO people. I think their data could be a useful addition to DBPedia. They have extracted a classification hierarchy from Wikipedia combined with WordNet. 
DBPedia does not have a proper classification hierarchy yet, but we are working on extracting such a hierarchy from Wikipedia categories (the Wikipedia categories itself are not always compatible with rdfs:subClassOf). The other relations extracted by the YAGO algorithm are imho not present in DBPedia, so it would be interesting to combine the data sets. Jens uHi Mike, I would guess that we have around 1.6 million dbpedia instances in the dataset, plus around 100 000 references to instances within other datasets. Depends on your definition of the tree terms above. Using RDF-S terms: We have around 20 classes mostly people and geographic locations. Around 10 000 skos:Concepts derived from the Wikipedia category system. Around 1.6 million instances derived from individual articles of the English Wikipedia. Around 90 million facts, if you equal this term with RDF triple. Orri is working on a COUNT() extension for Virtuoso's SPARQL engine. When he is done, we will be able to provide proper statistics. Cheers Chris uThanks, all. The responses to my questions have been great. One quick follow-on: Can anyone tell me what the size (in MB) of the complete data files that can be downloaded from the Web site are unzipped? BTW, I should have something to post on this within the next day. Thanks, Mike Chris Bizer wrote: [snip] [snip]" "Loading full DBpedia dataset, timings and approach" "uHi, I'm trying to create a local mirror of DBpedia using Openlink Virtuoso open source edition and it seems to be taking a number of days to load the data. I've used the DBpedia load scripts previously submitted on the list by Hugh Williams (using ttlp_mt). I've so far managed to load : articlecategories_en.nt infobox_en.nt redirect_en.nt articles_label_en.nt infoboxproperties_en.nt shortabstract_en.nt disambiguation_en.nt longabstract_en.nt wikipage_en.nt externallinks_en.nt pagelinks_en.nt The load time seems to increase massively as files are loaded. I've used the virtuoso configuration settings from be adding the extra recommended indexes from This results in a 15Gb virtuoso db file which seems to take an inordinately long time to query. I was wondering if there are any recommended techniques for loading a full dataset e.g. are there any methods for partitioning of data to improve load and query times ? Could anyone else let me know how long their load times are for a full core dataset (or even just the en version) ? My server is a 2Gb, AMD Athlon BE-2400 (dual core) - Is this 'good' enough, should I invest in more memory ? Any suggestions would be really appreciated, Thanks, Rob uHi Rob, I might be wrong here or there might be some things that can be done better, but this is my experience with loading datasets in Virtuoso. My load took close to a day. If this is any indication that might be useful, my total Virtuoso db size came to 12Gb; I did not choose some of the datasets you loaded. I also only loaded the en set. From my own estimates I figure that a full load of multiple languages on my machine is going to take a *long* time. I got the same, almost exponentially increased load times the more datasets I loaded. The configuration settings (your first link) definitely improved the loading performance for me. My load would have gone on for a couple of days if those configuration settings were not used. By the way, I loaded mine onto a CoreDuo (1st-gen Intel) MacBook Pro, running at 2.16GHz, and with 2Gb of RAM – not too different from your specs. 
The hard disk on your server should run faster, and you'll probably get faster queries. Virtuoso runs very well for what I'm applying it for (an art project that requires simple SPARQL queries over the interval of 30 seconds or so). However, on top of running the Virtuoso server I'm also executing more CPU-intensive Java applets that send my cooling fans into a frenzy. I haven't had a chance to monitor CPU load yet but even with the extra load the queries perform satisfactorily. I'm calling the queries via POST methods from my Java app to the sparql endpoint. What kind of queries are you calling? Make sure you also include the default-graph-uri= all datasets in it's databases for the query. Hope this helps, Andrew On Apr 18, 2008, at 2:21 PM, robl wrote: uHi Guys, On the same topic, I have myself been using Allegro Graph to load dbpedia data sets with hardware 2.16 GhZ, 2GB RAM, single core, using a java client on a network. I have been able to beat the benchmark results for Virtuoso which happened to be the fastest in the results published on times(If anybody is interested, i'll be happy to share the logs). For all the five queries i almost got the same results as in the benchmarks. Now the problem comes when i start doing the regex filter queries. The performance goes down the hill. For a very simple query like PREFIX p: SELECT ?x WHERE {{ ?x ?y ?z } FILTER regex(str(?x), \"kevin\", i)} it takes close to 3-4 minutes. I also loaded few other actual dbpedia datasets and been getting 4-5 minutes for the regex queries. Has anybody done any benchmarks on \"regex\" queries, since my suspicion is that they are making the querying slower. Are there any alternatives out there to search for keywords in resources which are faster and dont use \"regex\". Or if anybody had any faster results with any other systems like Sesame and Jena with large data sets and getting faster querying results. I would appreciate any help in this regard. thanks -amit On Fri, Apr 18, 2008 at 6:13 PM, Andrew (Chuan) Khoo < > wrote: uKool dude wrote: That's no wonder - regex queries can not be optimized by (any) database. We were using MySQL's full-text search indexing or the SQL LIKE operator (with fixed query prefix), which results in much better performance - I think Virtuoso offers something similar. Best, Sören uKool dude wrote: Amit, If you are going to make benchmark claims about Virtuoso, please do the following: 1. Publish you infrastructure specs 2. List your queries and timings If you are beating Virtuoso 2-3 times, then I think the whole world needs to be able to assimilate this in empirical form :-) And of course, we will respond to your empirical data accordingly :-) Kingsley uKingsley Idehen wrote: Hi, I just thought it might be worth mentioning the mailing lists as a potential source of information and place to ask questions for Virtuoso DBpedia users : and the wiki : (It would be nice if we could update the wiki though !) Are there any other useful virtuoso resources out there ? Thanks, Rob urobl wrote: Yes, here is the actual true Wiki style Wiki: We also have discussion forums at: Of course: 1. 2. Kingsley" "Wrong type in RDF & SPARQL view" "uHi, While writing a federated SPARQL query I noticed an error in a data set: I'm using dbpedia-owl:municipalityCode, which according to the HTML view is \"138 (xsd:integer)\" But both SPARQL and dereferencing RDF disagree: SPARQL: Same with rapper: dbpedia-owl:municipalityCode \"0138.046\"@en ; Anyone got an idea how the 0.046 got added there? 
Obviously this messes up comparing it to a real integer value (which it is supposed to be). This was the only such value I found while querying for Swiss municipalities, all others seem to be fine. cu Adrian uHi Adrian, On 02/09/2013 10:34 AM, Adrian Gschwend wrote: the one which equals \"138\"^^ is dbpprop:municipalityCode not dbpedia-owl:municipalityCode. the value of dbpedia-owl:municipalityCode is \"0138.046\"@en, which is the same as the one already in the source article [1]. [1] index.php?title=Samstagern&action;=edit uOn Sat, Feb 9, 2013 at 5:22 AM, Mohamed Morsey < > wrote: That seems very wrong. Where is this mapping/extraction defined? And why are there two different mappings? That seems not quite as wrong, but still wrong since the identifier string doesn't really have an associated language. Tom [1] > On Sat, Feb 9, 2013 at 5:22 AM, Mohamed Morsey < > wrote: the one which equals '138'^^< index.php?title=Samstagern&action=edit uOn 02/09/2013 05:01 PM, Tom Morris wrote: Theres is only one mapping for \"Infobox Swiss Town\", and infobox attribute \"municipality_code\" is mapped to DBpedia property \"municipalityCode\". This value is extracted by the \"InfoboxExtractor\" which depends on some heuristics to infer the datatype of the extracted information. Similar issue was discussed here [1]. You are right in that. [1] forum.php?thread_name=CA%2Bu4%2Ba0R5AXEsem%2BE1QJdnq2cLH2v9%2B%2B%3DSrCg08eYQdoEVu4Og%40mail.gmail.com&forum;_name=dbpedia-discussion uOn 09.02.13 11:22, Mohamed Morsey wrote: Hi Mohamed, oh totally oversaw that one! Should not ignore the prefix ;) Never noticed that there are in fact two of them, thanks for the hint! you are right, will fix that in the article, this has to be an integer and the value is indeed 138, which is the official ID of Richterswil. I'm working on RDF views of Federal Statistical Office data so I checked there. Also I will link to the DBpedia articles, that's why I use it. cu Adrian uOn 09.02.13 17:20, Mohamed Morsey wrote: Hi Tom, hi Mohamed, ok but is there any reason that there are two properties? yeah that was confusing me too, had to cast it to integer in SPARQL to be able to map that. Could this be fixed? cu Adrian uOn 9 February 2013 18:21, Adrian Gschwend < > wrote: some/all infobox properties, and the dbpedia-owl properties have been explicitly mapped. Ben uHi Adrian, On 02/09/2013 06:21 PM, Adrian Gschwend wrote: Actually there are 2 extractors which extract data from an infobox, specifically the \"InfoboxExtractor\" and the \"MappingExtractor\". The InfoboxExtractor, extracts all infobox attributes and uses the attribute name as the property name in DBpedia. For example in [1], there is the following attribute: | languages = German this attribute name is used as is, and represented in the \" \" The MappingExtractor, extracts structured data based on hand-generated mappings of Wikipedia infoboxes to the DBpedia ontology. It then extracts the value of each Wikipedia property defined for that type of infobox, and generates an appropriate triple for it, based on those mappings. The DBpedia mappings is available at [2]. You can find more information about the extractors here [3]. [1] [2] [3] Extractor uHi Ben, On 02/09/2013 06:48 PM, Ben Companjen wrote: yes, and more information is available here [1]. [1] Extractor uOn 09.02.13 20:21, Mohamed Morsey wrote: Hi Mohamed, [] great, thanks for clarifying! 
cu Adrian" "Invalid character in infobox-mappingbased-loose.nt when trying to load DBPedia in virtos" "uMarvin Lugair < > wrote on 2008-11-20 00:27: This is kind of late, but I encountered the same problem when trying to load infobox-mappingbased-loose.nt into rdflib and found it was an unescaped space in a URI: . Change the space to %20." "Help needed with DBPedia Live mirror setup on Mac OS X" "uHi, I am trying to set up a DBpedia Live Mirror on my personal Mac machine. Here is some technical host information about my setup: Operating System: OS X 10.9.3 1. Downloaded the initial data seed from DBPedia Live at 2. Downloaded the synchronization tool from 3. Executed the virtload.sh script. Had to tweak some commands in here to be compatible with OS X. 4. Adapted the synchronization tools configuration files according to the README.txt file as follows: a) Set the start date in file \"lastDownloadDate.dat\" to the date of that dump (2013-07-18-00-000000). b) Set the configuration information in file \"dbpedia_updates_downloader.ini\", such as login credentials for Virtuoso, and GraphURI ( 5. Executed \"java -jar dbpintegrator-1.1.jar\" on the command line. This script repeatedly showed the following error: INFO - Options file read successfully INFO - File : INFO - File : WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error:  (No such file or directory) INFO - File : WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.added.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error:  (No such file or directory) INFO - File : INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000002.removed.nt.gz has been successfully downloaded INFO - File : /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt.gz decompressed successfully to /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement Questions 1) Why do I repeatedly see the following error when running the Java program: \"dbpintegrator-1.1.jar\"? Does this mean that the triples from these files were not updated in my live mirror? WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error:  (No such file or directory) 2) How can I verify that the data loaded in my mirror is up to date? Is there a SPARQL query I can use to validate this? 3) I see that the data in my live mirror is missing wikiPageId (http://dbpedia.org/ontology/wikiPageID) and wikiPageRevisionsID (http://dbpedia.org/ontology/wikiPageRevisionID). Why is that? Is this data missing from the DBpedia live data dumps located here (http://live.dbpedia.org/dumps/)? Please do let me know. Thanks! ~ Shruti Software Engineer, San Francisco Area Hi, I am trying to s et up a DBpedia Live Mirror on my personal Mac machine. Here is some technical host information about my setup: Operating System: OS X 10.9.3 Processor 2.6 GHz Intel Core i7 Memory 16 GB 1600 MHz DDR3 Database server used for hosting data for the DBpedia Live Mirror: OpenLink Virtuoso (Open-source edition: https://sourceforge.net/projects/virtuoso/) Here's a summary of the steps I followed so far: 1. 
Downloaded the initial data seed from DBPedia Live at http://live.dbpedia.org/dumps/: dbpedia_2013_07_18.nt.bz2 2. Downloaded the synchronization tool from http://sourceforge.net/projects/dbpintegrator/files/. 3. Executed the virtload.sh script. Had to tweak some commands in here to be compatible with OS X. 4. Adapted the synchronization tools configuration files according to the README.txt file as follows: a) Set the start date in file \"lastDownloadDate.dat\" to the date of that dump (2013-07-18-00-000000). b) Set the configuration information in file \"dbpedia_updates_downloader.ini\", such as login credentials for Virtuoso, and GraphURI (http://live.dbpedia.org). 5. Executed \"java -jar dbpintegrator-1.1.jar\" on the command line. This script repeatedly showed the following error: INFO - Options file read successfully INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been successfully downloaded INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000001.removed.nt.gz has been successfully downloaded WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error: (No such file or directory) INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000001.added.nt.gz has been successfully downloaded WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.added.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error: (No such file or directory) INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been successfully downloaded INFO - File : http://live.dbpedia.org/changesets/2014/06/16/13/000002.removed.nt.gz has been successfully downloaded INFO - File : /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt.gz decompressed successfully to /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement WARN - null Function executeStatement Questions 1) Why do I repeatedly see the following error when running the Java program: \" dbpintegrator-1.1.jar \"? Does this mean that the triples from these files were not updated in my live mirror? WARN - File /Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz cannot be decompressed due to Unexpected end of ZLIB input stream ERROR - Error: (No such file or directory) 2) How can I verify that the data loaded in my mirror is up to date? Is there a SPARQL query I can use to validate this? 3) I see that the data in my live mirror is missing wikiPageId (http://dbpedia.org/ontology/wikiPageID) and wikiPageRevisionsID (http://dbpedia.org/ontology/wikiPageRevisionID). Why is that? Is this data missing from the DBpedia live data dumps located here (http://live.dbpedia.org/dumps/)? Please do let me know. Thanks! ~ Shruti Software Engineer, San Francisco Area uHi Shruti, I'm not a mac user so I can't help you since I can't reproduce your issues. Might i suggest you try to use the tool in a Ubuntu virtual machine to see if the issues you are reporting are coming from the dbpintegrator or from OSX. A tool that makes it easier for you create VM's on Mac OS is Vagrant[1] or just plain Virtualbox. Cheers, Alexandru [1] On Thu, Jun 19, 2014 at 7:34 PM, A Shruti < > wrote: uHi Alexandru, Thanks for getting back to me. 
I understand you cannot help answer question #1 since you are unable to reproduce the problem. However, can you possibly help answer questions #2 and #3? Thanks, Shruti On Thursday, June 19, 2014 11:20:02 AM, Alexandru Todor < > wrote:" "dbpedia 3.4 import virtuoso 5.11, some date is missing" "uhi folks, i´m trying to import dbpedia 3.4 en + de into virtuoso. i use a script comming around on the virtuoso maillist. the script tries to import each line of the *.nt files separate. some lines/uri´s are bad, containing whitespaces and they are skipped, no problem. but i´m wondering, i miss data in virtuoso which seems right. maybe you can post your importscripts or describe your importprocess? another question, what is the difference between geo-coords en and de? is the de file a subset of en? cheers, armin uHi Armin I presume you are referring to the following script: If so you might want to try using changing the following in the install_nt.sh script: ttlp_mt (file_to_string_output ('$f'), '', '$g', 17); to : ttlp_mt (file_to_string_output ('$f'), '', '$g', 255); The last param being the flags bit mask for controlling the strictness of parsing by the ttlp_mt functions as detailed at: The current online DBpedia 3.4 instance is hosted in a Virtuoso v6 clustered server for performance and faceted browsing support and a different script to the one above was used for loading the datasets, which we are considering documenting and making available for public use. Although the above script should equally work especially with the v5 server it was originally written for. Someone form the DBpedia datasets team will have to comment on differences between the en and de geo-coords dataset files Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 26 Nov 2009, at 11:58, Armin Nagel wrote: uHi Armin, the GeoExtractor [1] runs for all languages. It looks for several known templates to extract coordinates . As with all other etxractors, we only use those non-English articles that have an interwiki link to an English article (because we need to use the same dbpedia URIs for all languages). So the German articles we use are basically a subset of the English articles. Since the coordinates data should be pretty much the same in all languages, this means that geo_de is basically a subset of geo_en, except where the English and German Wikipedia disagree about the coordinates of a place. Schönen Gruß nach Mitte, :-) Christopher [1] On Thu, Nov 26, 2009 at 12:58, Armin Nagel < > wrote:" "minor problem w/ faceted search web interface" "uHi DBpediaers, I was innocently enough looking to find uris related to Lil' Louis' 1989 house anthem French Kiss. I entered the search term \"Lil'Louis French Kiss\" in the faceted search interface here Note my search term includes a ' This caused an error it seems. Removing the ' and replacing w/ whitespace resulted in a successful search giving me the results i wanted. Perhaps such characters should be escaped / replaced / filtered somehow? Cheers, Kurt J uHi Kurt, Have you tried searching for: Lil'Louis French Kiss ie without and doubles quotes around it, as that returns what look like 6 meaningful results ? Although I do agree that the query with double quotes should also have worked, thus this is something we need to check at the Virtuoso end where the DBpedia service is hosted. 
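While the endpoint-side bug gets looked at, the apostrophe can also be defused on the client side. A small sketch, assuming Python 3; the facet URL and parameter name are only illustrative — the point is to let a library do the percent-encoding and to escape quotes before they land inside a SPARQL literal:

from urllib.parse import quote_plus

term = "Lil'Louis French Kiss"

# 1. In a URL query string: percent-encode instead of pasting the raw text (the ' becomes %27)
url = "http://dbpedia.org/fct/?q=" + quote_plus(term)   # illustrative URL and parameter
print(url)

# 2. Inside a SPARQL string literal: backslash-escape quotes first
escaped = term.replace("\\", "\\\\").replace("'", "\\'")
query = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
    "SELECT ?s WHERE { ?s rdfs:label ?l . "
    "FILTER(CONTAINS(STR(?l), '" + escaped + "')) } LIMIT 10"
)
print(query)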
Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 19 Aug 2009, at 12:35, Kurt J wrote: uactually this is what i meant, original search had no double quotes. mysteriously this string give no error however using just: Lil'Louis returns the error. -kurt uHi Kurt, Mysterious indeed, we are looking into it Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 19 Aug 2009, at 14:22, Kurt J wrote:" "Introducing Gazetiki (geographical database)" "uHello, Gazetiki is a geographical database which was constituted at CEA LIST ( In order to improve Gazetiki, your feedback and questions are most welcome.  Best regards, Adrian Popescu From: Mihály Héder < > To: Yury Katkov < > Cc: Sent: Thursday, 6 October 2011, 11:45 Subject: Re: [Dbpedia-discussion] Introducing Sztakipedia Hello! 1) The toolbar is really a MediaWiki user script (javascipt), not a browser extension or something, and you can enable in your account right now. Check pedia.sztaki.hu \"Enable it in your account\". It communicates with a server endpoint which is provided by us and is totally public and free (but is in beta! Could not test it with crowd load yet). Behind that endpoint there are a couple of servers: UIMA, Solr(Lucene) and other stuff. That stuff is not a beast you just download and install, but you don't need it anyway. 2) Well, your second question is a harder one. What I can promise that we will come up with some general version you can use but with less functionality. -the categorization relies on Yahoo search. As long as yahoo indexes the Wiki of your preferred language we can make it work. (A long-term issue is that we have to pay a small amount for it - some 4$ / 10K search - I will try to find someone at yahoo and ask for their support for instance in exhange for putting their logo in the suggestion window. But right now I don't even have a contact to them.) -Link recommendation relies on tf-idf data and the dbpedia data. To do the tf-idf calculation we need the xml dump of a certain wiki and run some scripts. It takes about a week in case of the english wiki, others are of course much smaller BUT we need some kind of stemmer or lemmatizer to the given language - preferably one which we can integrate with UIMA. We already have integrated snowball, so in theory we are able to process any language snowball supports ( don't do the stemming, in theory tf-idf can still work but problems arise with languages like Hungarian - where we concatenate funky suffixes to the words to signal past tense, posessive, modalities, etc>From dbpedia we use the list of pages so its not optional. -Infobox recommendation is similar - it relies on the XML dump, and the corresponding dbpedia infobox data. If we have those we can start a kind of machine learning (actually done by lucene). To be able to display the infobox fill form with help, we also need certain xml files for infoboxes. -there is a co-occurence learning phase, it relies on XML dump and tf-idf, and is needed for book recommendation. But book search works without that. -Book recommendation is quite simple - you can use the english books, which are often referenced in non-english texts as well. To have non-english books we need library catalogs in some processable format. That can be an issue, I have not even found one for Hungarian yet. However, we could change this part and use library API's like Z39.50. 
There are always performance issues with those but I can see that sooner or later we need to support those. So to sum up, adding a new language is a piece of work right now and we need certain resources. However, we will try German and Hungarian in this year. We will try to simplify the process and will do our best in supporting more languages.  But we always gonna need some help form locals to the given country - library data and testing. I hope I answered your questions and you will become a happy user! Best Regards Mihály On 5 October 2011 16:56, Yury Katkov < > wrote:" "defining new class in dbpedia ontology" "uHi I've been trying to define a new class in the dbpedia ontology (Bank, subclassof Company) but there is a strange thing happening not covered by the documentation Even though my new class appears in the ontology view ( select distinct * { ?p ?o . } gives no results. Finally when clicking in the mapping wiki on Thanks a lot in advance Johannes Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration, Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci. This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. uHi Johannes, This should be fixed now, can you confirm? Dimtiris On Thu, Dec 4, 2014 at 3:15 PM, < > wrote: uHi Dimitris, Yes, it looks OK now (since a few days already). Did you correct my definition or was this due to a bug somewhere else? Thanks in any case! Johannes BTW I also modified the mapping for Infobox_company ( De : Dimitris Kontokostas [mailto: ] Envoyé : mardi 9 décembre 2014 14:50 À : HEINECKE Johannes IMT/OLPS Cc : Objet : Re: [Dbpedia-discussion] defining new class in dbpedia ontology Hi Johannes, This should be fixed now, can you confirm? Dimtiris On Thu, Dec 4, 2014 at 3:15 PM, < > wrote: Hi I’ve been trying to define a new class in the dbpedia ontology (Bank, subclassof Company) but there is a strange thing happening not covered by the documentation Even though my new class appears in the ontology view ( select distinct * { ?p ?o . } gives no results. Finally when clicking in the mapping wiki on Thanks a lot in advance Johannes Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration, Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci. This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. 
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. uOn Tue, Dec 9, 2014 at 4:01 PM, < > wrote: Neither :) The ontology updater script had stopped. It should catch all ontology updates now What you suggest in not supported but you could test individual pages as an alternative" "query dbpedia to find linked phrases" "uHi, I would like to ask if there is any way to query dbpedia (using SPARQL) in order to find out all the entities that are links to other wiki-articles within the wiki-article of a specific entity whose URI is known. For example I know the resource URI of the entity :\"President of the United States\" : in the corresponding wiki-article some of the first links to other articles are : P.O.T.U.S. (Sirius XM) ., head of state , head of government , United States . etc. Is there any way to get the above mentioned entities in any form (eg URI) ? Something like: etc. Is relevant information about links and backlinks stored in triplet form? Thank you!! Chryssa Hi, I would like to ask if there is any way to query dbpedia (using SPARQL) in order to find out all the entities that are links to other wiki-articles within the wiki-article of a specific entity whose URI is known. For example I know the resource URI of the entity :'President of the United States'  : Chryssa uHi Chryssa, On 01/13/2012 02:28 PM, Chryssa Zerva wrote: You can find the information you are looking for in the DBpedia dump titled \"Wikipedia Pagelinks\" which is available for download at [1]. uThanks for your answer but I still have problems. I actually need to built an application that will directly connect to the server of dbpedia without the need to download any dumps. But when I try to query dbpedia to get possible Wikipagelinks (that I have already check they exist in the dumps) I get no valid answers. For example the query: SELECT distinct ?Plinks WHERE { ?Plinks . } will return nothing, not even when I enter the query on the Virtuoso SPARQL Query Editor online. 2012/1/13 Mohamed Morsey < > uHi, On 01/16/2012 10:05 AM, Chryssa Zerva wrote: This is because the dump titled \"Wikipedia Pagelinks\" is not loaded to the official endpoint. So, I would suggest you to establish your own endpoint, i.e. you can download all DBpedia dumps and load them onto one of your machines and direct your queries against it. uI see, is there any possible way to know which dumps are loaded to the official endpoint? 2012/1/16 Mohamed Morsey < > uHi, you can find them at On 01/16/2012 10:23 AM, Chryssa Zerva wrote: uOn 1/16/2012 4:14 AM, Mohamed Morsey wrote: Why not? From my p.o.v. the pagelinks are one of the most interesting things in DBpedia" "Missing properties for a resource from live.dbpedia.org when compared to dbpedia.org" "uHi, Frequently, I am finding that the set of triples returned by live.dbpedia.org for a resource would be missing some properties, when compared to the triples returned by dbpedia.org. For example, compare vs The triples returned from live.dbpedia.org for Titanic do not have any of the ontology properties for director, writer, musicComposer etc, whereas the triples returned from dbpedia.org (using the above url) do have these properties for director, writer, musicComposer Is it expected that live.dbpedia.org wouldn't have all the properties for a resource, although they are available from dbpedia.org? ThanksArun uHi Arun, On 02/26/2013 05:46 AM, Arun Chippada wrote: There is an encoding issue here, and we will try to fix it ASAP. 
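One way around the encoding issue when checking a single resource is to query the Live SPARQL endpoint by IRI directly, so no re-encoding of the /data/ or /page/ path is involved. A sketch; the endpoint URL and the format parameter follow the usual Virtuoso conventions and are not taken from the original reply:

import json
import urllib.parse
import urllib.request

ENDPOINT = "http://live.dbpedia.org/sparql"
QUERY = "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Titanic> ?p ?o } LIMIT 100"

params = urllib.parse.urlencode({"query": QUERY, "format": "application/sparql-results+json"})
with urllib.request.urlopen(ENDPOINT + "?" + params) as resp:
    data = json.load(resp)

for row in data["results"]["bindings"]:
    print(row["p"]["value"], "->", row["o"]["value"])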
The data in DBpedia-Live is complete and if you use the following query, you will get the complete data about \"Titanic\": uThanks again Morsey. Greatly appreciate your help! The encoding issue explains several of the cases that I have seen. However, I saw other cases, where dbpedia ontology properties are missing in the triples returned from live.dbpedia, where it may not be due to the uri encoding issue. For example, vs In the triples returned from live.dbpedia.org, the properties - , and few other ontology properties are missing. These ontology properties are, however, returned from dbpedia.org. I am wondering whether the missing triples are not existing in the live.dbpedia triple store (or) if they are getting dropped during the request. Another example is ThanksArun Date: Tue, 26 Feb 2013 09:32:18 +0100 From: To: CC: Subject: Re: [Dbpedia-discussion] Missing properties for a resource from live.dbpedia.org when compared to dbpedia.org Hi Arun, On 02/26/2013 05:46 AM, Arun Chippada wrote: Hi, Frequently, I am finding that the set of triples returned by live.dbpedia.org for a resource would be missing some properties, when compared to the triples returned by dbpedia.org. For example, compare vs The triples returned from live.dbpedia.org for Titanic do not have any of the ontology properties for director, writer, musicComposer etc, whereas the triples returned from dbpedia.org (using the above url) do have these properties for director, writer, musicComposer There is an encoding issue here, and we will try to fix it ASAP. Is it expected that live.dbpedia.org wouldn't have all the properties for a resource, although they are available from dbpedia.org? The data in DBpedia-Live is complete and if you use the following query, you will get the complete data about \"Titanic\": Thanks Arun" "Danish persondata" "uHi Everybody I'm trying to find a way to extract all persons from the Danish version of dbpedia. When i look in persondata_en.csv.bz2 file. But in there is no Danish equivalent. Is it possible to find or make such a file? Best /Johs. W." "Dbpedia Live latest updates" "uHello, I was looking for recently published resources on Dbpedia Live but couldn't find them. I then checked the *last published file [1]* and see that the last update is dated Nov 12, 2014. The statistics [2] also do not show any sign of activity. Is Dbpedia Live still running? Thanks, David [1] [2] ► HelixWare online video platform ► WordLift semantic web for WordPress ► RedLink - making sense of your data ► US Export compliance extension for WooCommerce ══════════════════════════════════════════════ ► Twitter: @ziodave" "Mapping a template without key=value properties" "uWe're working on Italian DBpedia, trying to complete the mapping of the template \"Sportivo\" [1]. This template includes another template, \"Carriera Sportivo\" [2], which is used to track down the career steps of a soccer player. The problem with \"Carriera Sportivo\" is that this template doesn't seem to have the typical key=value format. We'd like to know if it's possible to map these kinds of template using properties' positions. 
It's better to describe it with an example of the mapping we want to introduce: {{IntermediateNodeMapping | nodeClass = CareerStation | correspondingProperty = careerStation | mappings = {{PropertyMapping | templateProperty = 1 | ontologyProperty = years }} {{PropertyMapping | templateProperty = 2 | ontologyProperty = club }} {{PropertyMapping | templateProperty = 3 | ontologyProperty = numberOfMatches }} }} {{IntermediateNodeMapping | nodeClass = CareerStation | correspondingProperty = careerStation | mappings = {{PropertyMapping | templateProperty = 4 | ontologyProperty = years }} {{PropertyMapping | templateProperty = 5 | ontologyProperty = club }} {{PropertyMapping | templateProperty = 6 | ontologyProperty = numberOfMatches }} }} {{IntermediateNodeMapping | nodeClass = CareerStation | correspondingProperty = careerStation | mappings = {{PropertyMapping | templateProperty = 7 | ontologyProperty = years }} {{PropertyMapping | templateProperty = 8 | ontologyProperty = club }} {{PropertyMapping | templateProperty = 9 | ontologyProperty = numberOfMatches }} }} Any hints to solve this problem? Simone, Federico, Marco Links: [1] [2] Template:Carriera_sportivo uHi Simone, Yes this is implemented. See Cheers, Dimitris On Tue, Jun 9, 2015 at 2:05 PM, Simone Papalini < > wrote: uHi Dimitris, thank you for the reply. I'm Marco and i work with Simone. I have just a question. Why this kind of templates (like \"See_also2\") are not included on [1]. Marco [1] 2015-06-10 10:04 GMT+02:00 Dimitris Kontokostas < >:" "Dbpedia and facets" "uHi all, As I mentioned in my email from 4th of march: > We will be working on: > (1) a better query interface for non expert users > (2) identification of classes withing the article categories > (3) application of Ontology learning for simple classification The first and highest priority point also includes an implementation of facet based browsing, which is quite far and as soon as we have either of the category-class mappings (YAGO or Ankes) will be very useful. I suppose Longwell probably won't work, since due to the amount of data there has to be a very tight coupling to the storage backend. Why don't we use the discussion mailinglist?" "BIND in DBpedia SPARQL endpoint" "uHi, I'm trying to create a new URI which I will use in a CONSTRUCT query in the end based on some data found on DBpedia: uHi, I'm trying to create a new URI which I will use in a CONSTRUCT query in the end based on some data found on DBpedia: uOn 04.06.12 14:33, Adrian Gschwend wrote: woops sorry for that, pressed some wrong key combination in Thunderbird which sent it out before I was done. That was the source: cu Adrian uHi Adrian, first of all sorry for the belated reply. please have a look on that thread On 06/04/2012 02:33 PM, Adrian Gschwend wrote:" "Content Negotiation for DBpedia" "uDear DBpedia list members, I successfully importet the DBpedia datasets into a Virtuoso installation. In the next step I would like to set up content negotiation following linked data principles ( very much obliged for any information on how this, e.g. * How are the *HTML- and RDF-representations* created, i.e. Thing: HTML: RDF: * How is the *content negotiation* (\"303 See Other\") from the Thing-URI to the appriate document-URI implemented? Is there any info out there on how to do this? Best wishes, Cord uHi Cord, Virtuoso has a built-in plugin now called \"dbpedia\". If you load the dbpedia datasets under \" box. 
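For the 303 part of Cord's question, the behaviour is easy to observe against the public service before reproducing it locally: request the resource URI with different Accept headers and inspect where the server redirects to. A sketch — Berlin is just an illustrative resource, and the handler below only suppresses redirect-following so the 303 and its Location header stay visible:

import urllib.error
import urllib.request

RESOURCE = "http://dbpedia.org/resource/Berlin"  # illustrative Thing-URI

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow, so the 303 surfaces as an HTTPError we can inspect

opener = urllib.request.build_opener(NoRedirect)

for accept in ("text/html", "application/rdf+xml"):
    try:
        resp = opener.open(urllib.request.Request(RESOURCE, headers={"Accept": accept}))
        print(accept, "->", resp.getcode(), "(no redirect)")
    except urllib.error.HTTPError as e:
        print(accept, "->", e.code, e.headers.get("Location"))

With HTML requested the Location header should point at the /page/ document, with RDF requested at the /data/ document — which is exactly the split the dbpedia VAD sets up.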
For more specialized configuration we have forked it under [1] Best, Dimitris [1] On Tue, May 7, 2013 at 12:35 PM, Cord Wiljes < > wrote:" "Easier simple searches" "uI think that it needs to be easier to perform free text searches on DBpedia. Obviously, for complex semantic queries, the various SPARQL query interfaces provide powerful tools, but finding the URIs of interest in the first place remains challenging. One way to do it is to use one of the SPARQL UIs (I like ) and use a query phrase such as `?topic rdfs:label \"your topic here\"@en .` For this reason, I think it would be useful if that form defaulted to having that phrase in the body, instead of an ellipsis. This approach has some obvious problems, however. (For example, you need to be careful about the choice of or absence of the language tag, you need to get the label exactly correct, and you need to know a bit about SPARQL.) As a result, I think there should be a more sophisticated free text entry search box, which could use a SPARQL extension or at the other extreme could query Wikipedia directly and then rewrite the result URIs to the \"corresponding\" DBpedia pages. In any case, I think the chosen solution needs to be prominently included on at least the DBpedia home page, and likely from *every* DBpedia page. I think this should be as simple as a \"Search:\" input (such as the one that Wikipedia provides on each page). Thank you for your consideration. Respectfully, John L. Clark uHi John, I agree with you that there's a need for a simple search interface. I had built such an UI with DBpedia Search [1]. The prototype is unfortunately not working at the moment due to backend problems, but its functionality is briefly explained in our DBpedia ISWC paper [2]. The DBpedia dataset can be used to create a end-user friendly Wikipedia search, as its strong semantics could potentially provide much better results than the current key-text based search engines. But I think at the moment it would be more important to have a URI finder for data publishers. There was a discussion-thread [3] on the Linking-Open-Data list about how such a service could look like. Could be describe your use-case for a search functionality in more detail? Who is the user (end-user or data publisher)? What is he looking for? Cheers, Georgi [1] [2] [3] 22478" "set up Virtuoso to host DBpedia 3.4 locally" "uHi, I have some questions on setting up Virtuoso to host DBpedia 3.4 locally (without Amazon EC2 AMI). I have a running fresh local install of VOS 6.0.0 on linux, but am otherwise new to Virtuoso and DBpedia. I followed the instructions posted in earlier threads, please point me to the relevant discussion/documentation if I am missing something. I started from an old thread [1] on setting up DBpedia locally. The thread refers to the dbpedia_load bundle [2] for DBpedia 3.2 which includes the dbpedia_dav.vad package (version Nov 2008). According to a newer thread [3], the bundle should have been updated for DBpedia 3.4 (Nov 11 2009) but it seems like it was not (see below). Googling for the package, I found the Virtuoso EC2 DBpedia documentation page [4], which provides a download link [5] to a Mar 2009 version of the vad package (but not the bundle). Is [5] compatible with DBpedia 3.4? Is there an updated version of the dbpedia_load bundle? The outdated dbpedia_load bundle [2] checks for opencyc, umbel, and yago resource files that are available also from [6]. Which files should be loaded along with DBpedia 3.4 data? Maybe latest versions (see below)? 
Thank you in advance, Illes Solt [1] [2] [3] [4] [5] [6] Current versions of dbpedia_dav.vad files known to me: $ wget -q | tar -zxvf - ./dbpedia_dav.vad -O | grep -aE 'Release Date| $ wget -q -O - | grep -aE 'Release Date| Resource files required by [2]: umbel_class_hierarchy_v071.n3 umbel_abstract_concepts.n3 umbel_subject_concepts.n3 umbel_external_ontologies_linkage.n3 opencyc-2008-06-10.owl opencyc-2008-06-10-readable.owl yago-class-hierarchy_en.nt Guessed latest versions: http://www.umbel.org/ontology/umbel.n3 http://www.umbel.org/ontology/umbel_abstract_concepts.n3 http://www.umbel.org/ontology/umbel_subject_concepts.n3 http://www.umbel.org/ontology/umbel_external_ontologies_linkage.n3 http://sw.opencyc.org/downloads/opencyc_owl_downloads_v2/opencyc-2009-04-07.owl.gz http://sw.opencyc.org/downloads/opencyc_owl_downloads_v2/opencyc-2009-04-07-readable.owl.gz uHi Illes, Hmmm, the dbpedia_load.tar.gz script was updated on S3 in November 09, with what should have been the updated dbpedia_dav.vad package, but for some reason it was not it appears. I have updated it again and confirmed the updated dbpedia vad is included. Note the dbpedia_dav.vad package can also be downloaded directly from: Having said that whilst the dbpedia_load.tar.gz, which was the script used for loading the Dbpedia 3.2 datasets into a Virtuoso 5.x instance, and can be used for uploading the DBpedia 3.4 data sets, we used an updated method for uploaded the DBpedia 3.4 datasets in the the Virtuoso 6.x instance the current online version uses. Details on this new Virtuoso Bulk Loader method can be obtained at: The VirtEC2AMIDBpediaInstall/dbpedia_dav.vad file is correct for that document which is for loaing the DBpedia 3.2 datasets into a Virtuoso 5.x instance. I am not sure which opencyc, umbel, and yago resource files are loaded in the DBpedia 3.4 instance, and will need to confirm with development, but would imagine the latest should be used Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 7 Dec 2009, at 01:23, Solt, Illés wrote: uHugh Williams wrote: Hugh, The script above plus current DBpedia 3.4 datasets (which should include source files for OpenCyc, UMBEL, Yago, and SUMO; if these are missing we have a synchronization issue between ourselves and the public DBpedia data sets). Kingsley uKingsley, On 7 Dec 2009, at 12:16, Kingsley Idehen wrote: [Hugh] I have added a link in the documentation to the public DBpedia 3.4 datasets download page ( Regards Hugh" "a label issue" "uDear All, yet another small issue: has rdfs:label \"Даосизм\"@ru (which is translated as \"Taoism\" and quite strange) I hope, there is no WP page with this information. So, how could that happen? Best, Vladimir Ivanov Dear All, yet another small issue: Ivanov uHi Vladimir, we are aware of this and are currently fixing this and update the data as soon as possible. Cheers, Anja On Wed, 14 Apr 2010 16:03:07 +0400, Vladimir Ivanov < > wrote:" "MySociety's MaPit service: provides containing areas for any geotagged UK subject" "uHello - this is my first post here. MySociety's MaPit service now provides open data for any point in the UK, from its coordinates (or postcode), returning the containing administrative areas such as ward and constituency. So, if a Wikipedia article about any UK subject is geotagged, the relevant containing areas (and thus authorities) for the location can be determined. 
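To make the lookup concrete, here is roughly what the round trip could look like in code once you have an article's coordinates; the /point/4326/lon,lat URL pattern and the field names are my reading of the public MapIt API and should be checked against the service documentation before relying on them:

import json
import urllib.request

lat, lon = 52.4776, -1.8995  # illustrative coordinates in central Birmingham

url = "https://mapit.mysociety.org/point/4326/%s,%s" % (lon, lat)  # SRID 4326 takes lon,lat
with urllib.request.urlopen(url) as resp:
    areas = json.load(resp)

# the response is keyed by area id; each value describes one containing area
for area in areas.values():
    print(area.get("type_name"), "-", area.get("name"))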
For example see Electric Cinema, Birmingham: then select the coordinates (shown by default in the top-right), and on the page of links to mapping services: the MaPit link: is the final item under the \"Great Britain\" heading. Dropping the \".html\" suffix returns JSON: Either tells us that the cinema is in Ladywood ward of Birmingham City Council, Birmingham Ladywood constituency, and so on. Can this data be used/ referenced by DBPedia? uHi Andy, if there are dereferencable URIs for the areas, you could link them to the corresponding DBpedia concept. For example area 2514 ( come in handy for this linking job. And of course it would also be nice if you find a way to share the regions data in RDF. Cheers, Max [1] On Sat, Oct 9, 2010 at 2:05 PM, Andy Mabbett < > wrote: uI can't, because I don't work no the project. The home page has contact details, but I will forward your message to my contact there. Thank you. On 12 October 2010 16:01, Max Jakob < > wrote: uOn re-reading your mail, you seem to be suggesting that MaPit should link to DBPedia. I'm suggesting that DBPedia could extract data from MaPit, or perhaps link to MaPit. MaPit is open source, should you want to do more with it. On 12 October 2010 16:01, Max Jakob < > wrote:" "Empty type for DBpedia Spotlight annotation" "uHello, When annotating, I got sometimes empty type for some entities. How to do to get the type when the returned type is empty? This is the endpoints I am using: with confidence=0.2 and support=0 Any suggestion is welcome. Regards Olivier Hello, When annotating, I got sometimes empty type for some entities. How to do to get the type when the returned type is empty? This is the endpoints I am using: - Olivier" "From instance to class" "uHello everyone, i start begging pardon if it's not the right place to ask this anyway: I'm developing a software that navigates dbpedia instances and returns the node that satisfies certain requirements. Those requirements incude the \"topological\" distance between the actual node and another specified node. As you surely understand this distance is really intensive to compute, so i'm searching for new ways to get something similar but less expensive. I thought i could use the DBpedia ontology and to do something like this: 1 map the actual node to his dbpedia ontology class; 2 map the specified node to his class; 3 calculate the distance between the classes in the ontology (wich is smaller than the set of instances). I can do this because i don't need the full path between the nodes but just a value that represents the distance (in this case i think i can call it \"semantic distance\"). Now, i don't know where to start from for doing this (my software is in java). Wich dump should i use? Anyone knows reliable opensource libraries for managing owl (i never used it before so i'm really new to it)? Is there some kind of limitation i'm not aware of that can stop me doing what i described?I think i should use the \"ontology types\" dataset for the mappings If someone can clarify everything for me i would be really grateful. Regards, Piero uHello Piero, Piero Molino schrieb: There are several ways to achieve this depending on how exactly you measure distance between classes. One way would be to use SPARQL and query the official endpoint, e.g. using Jena [1]. The second way would be to use the OWL API [2]. What to do specifically, depends on your distance metric. For instance, you could ask yourself whether two classes A1 and A2 are similar in your scenario if A1 is a super class of A2.(?) 
A simple way would be to query parent classes of A1 until a class A' is found, which is also parent of A2. You then get a path from A1 to A2 with A' as middle element and can measure its length. Due to the existence of owl:Thing such a path always exists. Google comes up with a few papers with more sophisticated approaches related to measuring distance in ontologies [3,4,5], which might be helpful. In your description, you assume that there is one class for each object. In general, an object can be instance of several classes. In particular, it can also belong to several \"most specific\" classes. However, this does seem to be rare in the DBpedia ontology (and you can generalise the above description to this case). Kind regards, Jens [1] [2] [3] [4] [5] freeabs_all.jsp?tp=&arnumber;=4444245&isnumber;=4444190 uHi Jens, Il giorno 15/lug/09, alle ore 17:04, Jens Lehmann ha scritto: Ok thankyou, the two libraires you're suggesting me are an extremely good starting point. This remembers me of leacock chodorow measure used in my research lab for calculating semantic distance in wordnet. The fact that there's a class wich is a kind of root is a good thing for this. There's something alse like that i should know? Or can you even suggest me something like a tool for visualizing the ontology and became aware of his characteristics? (in a university course we used protege for building example ontologies, could it be useful?) It's really funny that the second paper you're suggesting me has been done by researchers in the same laboratory of the same university i'm actually working in :) so i thank you for your suggestion and i will probably go ask them some suggestion about distance metrics. Ok i get it. Now for example let's take: (my home town). the rdf:type property (wich i'm assuming is the one useful for the maping) gives back: rdf:type dbpedia-owl:Place dbpedia-owl:Area dbpedia-owl:Resource dbpedia-owl:PopulatedPlace Googling yago i've found it's an ontology based on wordnet structer (more or less). By the way as you told me the classes are one more specific than another. Is there a way to determinate how \"deep\" a class is other than calculating a path to owl:Thing ? I'm asking this because right now i'm thinking of mapping an instance to one class, maybe the most specific one, by te way i may find come other ways like map to every class and than take the deepesti don't know i will have to think a bit more about this :) Thank you again. Regards, Piero uHello, Piero Molino schrieb: [] Yes, Protégé can be useful. If you open the DBpedia ontology in Protege, go to the OWLViz tab, then select \"Options\" => \"Radius 5\", you get an overview of all classes. If you meet Francesca Lisi or Nicola Fanizzi, send them my regards. :-) DBpedia has different class hierarchies (DBpedia ontology, YAGO, OpenCyc, Umbel), which you should not mix in your approach. See Section 3.2 in our latest DBpedia paper [2] for an overview. The DBpedia ontology has the prefix Since we currently store all types of an entity (Place, Area, PopulatedPlace) for an entity and not just the most specific one (PopulatedPlace), you could also calculate the depth by just counting the number of classes. This works if there is a single most specific class and we keep storing all more general classes in the SPARQL endpoint (which might change in the future). Kind regards, Jens uHi Jens, sorry for resuming this old discussion but I'm working on this right now (when I started the discussion it was in an evaluation state). 
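Jens's common-superclass idea sketched in code, for anyone who wants to try it against the public endpoint: climb rdfs:subClassOf one level at a time for both classes (assuming a single parent per class, which holds for most of the DBpedia ontology) and add up the steps to the shallowest shared ancestor. The two example classes at the end are only illustrative:

import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"
PREFIX = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "

def parent(cls):
    # one step up the class hierarchy, or None at the top
    query = PREFIX + "SELECT ?sup WHERE { <%s> rdfs:subClassOf ?sup } LIMIT 1" % cls
    params = urllib.parse.urlencode({"query": query, "format": "application/sparql-results+json"})
    with urllib.request.urlopen(ENDPOINT + "?" + params) as resp:
        bindings = json.load(resp)["results"]["bindings"]
    return bindings[0]["sup"]["value"] if bindings else None

def chain(cls, max_depth=15):
    # the class itself followed by its ancestors, nearest first
    out, current = [cls], cls
    for _ in range(max_depth):
        current = parent(current)
        if current is None:
            break
        out.append(current)
    return out

def distance(c1, c2):
    chain1, chain2 = chain(c1), chain(c2)
    common = set(chain1) & set(chain2)
    if not common:
        return None
    return min(chain1.index(a) + chain2.index(a) for a in common)

print(distance("http://dbpedia.org/ontology/City",
               "http://dbpedia.org/ontology/Country"))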
Il giorno 17/lug/2009, alle ore 08.22, Jens Lehmann ha scritto: Analyzing the DBpedia ontology I've found that it is somehow \"lightweight\" in respect of the task i want to do on it. To calculate a semantic distance maybe YAGO fits better. Also it is based on the wordnet structure witch has been used in many projects as a resource for calculation of semantic distance. By the way the idea you gave me about calculating the distance counting the classes works in the DBpedia ontology because every class listed is a subclass of the other, but as i am observing from the example i posted before, in the YAGO ontlogy it is not like that. I would ask this question on a YAGO mailing list but i can't find any reference on this on their website so i a m sking here, i hope not o be out of contest. Another question: ad i want to have a \"general\" approach, i am finding some instances that are somehow different from the others as they don't have a rdf:type. For example: Assuming i'm calculating the semantic distance from owl:thing caounting the classes in rdf:type, this node would behave differently, so how could i do it in this cases? Thank you in advice. Regards, Piero" "Diffrence between URI, URIref and namespace URI?" "uWhat's the exact difference between a URI, a URIref and a namespace URI? Thank you! What's the exact difference between a URI, a URIref and a namespace URI? Thank you! u\"URI reference\" was another name for IRI. An IRI is basically a URI that may contain non-ASCII characters. See the (outdated) RDF 1.0 spec: If I'm not mistaken, the term \"URI reference\" was used in RDF 1.0 because the RFC for IRIs was not finished yet. RDF 1.1 only uses the term \"IRI\". A namespace URI - or more precisely namespace IRI - is the prefix shared by a certain set of RDF IRIs, e.g. \" \"Namespace IRIs and namespace prefixes are not a formal part of the RDF data model. They are merely a syntactic convenience for abbreviating IRIs.\" Hope that helps. JC On 1 May 2014 21:49, Luciane Monteiro < > wrote: uAs I said, an IRI is basically a URI that may contain non-ASCII characters. Also, every valid URI is also a valid IRI, but not vice versa. URI, because \"ü\" is not an ASCII character. The corresponding URI is Note that in RDF, these two identifiers aren't equivalent - they do not identify the same resource. On 2 May 2014 17:44, Luciane Monteiro < > wrote:" "working with dbpedia" "uHi all, I wish to send a sparql query to dbpedia and get some results. Unfortunately, I cant actually figure out how to do this. What is the best way to do it? I have tried to send the query in an http request and read the http response. However, I did not manage to succeed. Is there any tutorial for \"amateurs\" that would show you how to query dbpedia via sparql endpoints? thanks a lot uHi Savio, A simple way to ask SPARQL queries against DBpedia is to use the SNORQL query builder which sends SPARQL queries over HTTP to the DBpedia SPARQL endpoint. Just click on the link below for asking an example query (does not work with Internet Explorer, because of some javascript bugs) Cheers Chris" "24 Million Broken Pagelinks" "uI'm looking at the dbpeda 3.2 dump and noticed another odd triple in the pagelinks . The oddness is that as a resource or a redirect. The wikipedia entry that corresponds to !!! looks just fine and the link to \"bassline\" goes directly to the entry for Bassline, which corresponds to This isn't an isolated case: the object side of about 24 M pagelink triples fail to resolve, in contrast to 46M triples that do. 
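For anyone who wants to reproduce the resolving-versus-broken split on their own copy of the dump, the bookkeeping is simple enough to sketch: call a pagelink object "resolving" if it occurs as the subject of some triple or as a redirect source. The file names below are placeholders for the uncompressed 3.2 dump files, and the whole thing trades memory for simplicity:

import re

SUBJECT = re.compile(r'^<([^>]+)>')
LINK = re.compile(r'^<([^>]+)> <[^>]+> <([^>]+)> \.')

def subjects(path):
    # first URI on each line, i.e. the subject of the triple
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = SUBJECT.match(line)
            if m:
                yield m.group(1)

def link_objects(path):
    # object URIs of triples whose object is itself a URI (as in the pagelinks dump)
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LINK.match(line)
            if m:
                yield m.group(2)

known = set(subjects("labels_en.nt"))         # placeholder: any dump listing every resource as subject
known.update(subjects("redirects_en.nt"))     # redirect sources also count as resolving

ok = broken = 0
for obj in link_objects("pagelinks_en.nt"):   # placeholder name for the pagelinks dump
    if obj in known:
        ok += 1
    else:
        broken += 1
print("resolving:", ok, "broken:", broken)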
That means about 1/3 of the triples are bad." "Missing file at" "uThanks for the heads up and the mail to noc. Please let us know what they say. We'll have to find a new language list file. JC On Apr 11, 2013 2:02 PM, \"Omri Oren\" < > wrote: uThis one looks good, I think, but there may be some problems or changes. I can't check right now. On Apr 11, 2013 2:40 PM, \"Jona Christopher Sahnwaldt\" < > wrote: uOn Thu, Apr 11, 2013 at 8:40 AM, Jona Christopher Sahnwaldt < Are you after languages or wikis? You might be able to use one of the other files in that directory such as Tom uOn Apr 11, 2013 2:02 PM, \"Omri Oren\" < > wrote: Wikimedia has removed the file generate-settings relies on. (it contains a list of wikipedia languages) Omri, could you try to adapt the code to use don't know what the old format was and if the new file may contain fewer or more languages. When you succeed, please send a pull request so we can merge your contributions into the main branch: (I'm using the \"dump\" branch of the extraction framework. Is it still the most updated branch to work with? How do I find out in case this changes?) The master branch is most current. To find out if that changes, looking at the latest commits at from Google), Could you send a copy to the list? What was the format? \"en\" or \"enwiki\"? JC file back where it was." "OWL DatatypeProperty URLs" "uHi, Please could someone clarify if the following is inconsistent or explain how it should work? Looking in the file dbpedia_3.9.owl I see this: age ?????? With the DatatypeProperty defined as: However when I look at the datasets the 'age' tuples have a predicate of: This seems to be the case for some of the other properties in the OWL file too. I assume these URIs should match? Although I'm not an expert on OWL/RDF. Thanks, Tim. Hi, Please could someone clarify if the following is inconsistent or explain how it should work? Looking in the file dbpedia_3.9.owl I see this: age ηλικία ' | sed 's/ newuris.tmp cat newuris.tmp | sed 's/Location: \/view\/en\//http:\/\/rdf.freebase.com\/rdf\/en./' But then I realized, that I didn't have the original uris to replace them in the file. How where the Freebase links created in the first place? Does anyone have a script or an idea? Regards, Sebastian Hellmann uOn Thu, Jan 27, 2011 at 2:59 PM, Sebastian Hellmann < > wrote: That link looks correct to me. Are you setting your HTTP accept headers to say you want RDF? If you do, it should redirect you to Tom uAh, ok thank you. curl -I -L -H \"Accept: application/rdf+xml\" works of course. It was a conceptual error. As RDF, I thought I would get redirected to the data resource in RDF again and not the HTML representation. Thanks for the fast help, Sebastian On 27.01.2011 21:11, Tom Morris wrote: uOn Thu, Jan 27, 2011 at 3:25 PM, Sebastian Hellmann < > wrote: I don't know who made the decision to link to both sides of the argument. The /ns/ form is more useful if it escapes into a non-RDF context, but it does require the extra step of setting of the necessary HTTP headers for RDF consumers. Tom" "Feedback needed (Was: GSOC 2016: ß-Testing of lookup and Completion of the Tasks)" "uThis is great news Kunal, Congrats and public kudos to Tim Ermilov, Axel Ngonga and Sandro Coelho for the mentoring! 
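For anyone who wants to poke at the Lookup service while commenting on the proposed response format, a minimal request against the long-standing KeywordSearch endpoint, asking for JSON, looks roughly like this; the ß-test server URL is different and not reproduced here, and the endpoint and parameter names below reflect the production service as of this thread, so they may have changed since:

import json
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({"QueryString": "space", "MaxHits": 5})
req = urllib.request.Request(
    "http://lookup.dbpedia.org/api/search/KeywordSearch?" + params,
    headers={"Accept": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    for hit in json.load(resp).get("results", []):
        print(hit.get("uri"), "-", hit.get("label"))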
@Everyone this is a project where we need the community feedback Besides the enhancements & I18n improvements Kunal worked on, a major focus of this project was the update of the API trying to match the Google Knowledge Graph API [1] that can be returned with accept: application/json+ld from the for ß-testing server here . We have two important question before we merge Kunal's code 1) See en example response at the end of the email and tell us how close to the GKG return format/schema do you want us to go (a) leave it as is (similar) (b) one-to-one match with different terms (c) match the terms but change the JSON-LD context (d) match the exact schema? (We also think that there is no API copyright issue here based on the recent Google-Oracle dispute but other views are welcome) 2) should we enable the json-ld format by default? [1] Example response: { \"@context\": { \"@description\": \" \"@refCount\": \" \"@templates\": \" \"@redirects\": \" }, \"results\": [ { \"uri\": \" \"label\": \"Arthur Space\", \"description\": \"Charles Arthur Space (October 12, 1908 – January 13, 1983) was an American film, television and stage actor. He was best known as Doc Weaver, the veterinarian, in thirty-nine episodes of long-running CBS television series, Lassie.\", \"refCount\": 69, \"classes\": [ { \"uri\": \" \"label\": \"person\" } ], \"categories\": [ { \"uri\": \" \"label\": \"Cancer deaths in California\" }, { \"uri\": \" http://dbpedia.org/resource/Category:American_male_film_actors\", \"label\": \"American male film actors\" }, { \"uri\": \" http://dbpedia.org/resource/Category:American_male_television_actors\", \"label\": \"American male television actors\" }, { \"uri\": \"http://dbpedia.org/resource/Category:1983_deaths\", \"label\": \"1983 deaths\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Male_actors_from_New_Jersey\", \"label\": \"Male actors from New Jersey\" }, { \"uri\": \" http://dbpedia.org/resource/Category:American_male_stage_actors\", \"label\": \"American male stage actors\" }, { \"uri\": \" http://dbpedia.org/resource/Category:People_from_New_Brunswick,_New_Jersey\", \"label\": \"People from New Brunswick, New Jersey\" }, { \"uri\": \"http://dbpedia.org/resource/Category:1908_births\", \"label\": \"1908 births\" }, { \"uri\": \" http://dbpedia.org/resource/Category:20th-century_American_male_actors\", \"label\": \"20th-century American male actors\" } ], \"templates\": [], \"redirects\": [] }, { \"uri\": \"http://dbpedia.org/resource/Invader_(artist)\", \"label\": \"Invader (artist)\", \"description\": \"Invader is the pseudonym of a well-known French urban artist, born in 1969, whose work is modelled on the crude pixellation of 1970s-1980s 8-bit video games. He took his name from the 1978 arcade game Space Invaders, and much of his work is composed of square ceramic tiles inspired by video game characters. 
Although he prefers to remain incognito, and guards his identity carefully, his distinctive creations can be seen in many highly-visible locations in more than 60 cities in 30 countries.\", \"refCount\": 3, \"classes\": [ { \"uri\": \"http://www.wikidata.org/entity/Q215627\", \"label\": \"http://www.wikidata.org/entity/ q215627\" }, { \"uri\": \"http://www.w3.org/2002/07/owl#Thing\", \"label\": \"owl#Thing\" }, { \"uri\": \"http://dbpedia.org/ontology/Person\", \"label\": \"person\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# agent\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#NaturalPerson\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# natural person\" }, { \"uri\": \"http://schema.org/Person\", \"label\": \"person\" }, { \"uri\": \"http://www.wikidata.org/entity/Q5\", \"label\": \"http://www.wikidata.org/entity/ q5\" }, { \"uri\": \"http://xmlns.com/foaf/0.1/Person\", \"label\": \"http://xmlns.com/foaf/0.1/ person\" }, { \"uri\": \"http://dbpedia.org/ontology/Agent\", \"label\": \"agent\" } ], \"categories\": [ { \"uri\": \"http://dbpedia.org/resource/Category:Living_people\", \"label\": \"Living people\" }, { \"uri\": \"http://dbpedia.org/resource/Category:1969_births\", \"label\": \"1969 births\" }, { \"uri\": \" http://dbpedia.org/resource/Category:French_graffiti_artists\", \"label\": \"French graffiti artists\" } ], \"templates\": [], \"redirects\": [] }, { \"uri\": \"http://dbpedia.org/resource/Zack_Space\", \"label\": \"Zack Space\", \"description\": \"Zachary T. \\"Zack\\" Space (born January 27, 1961) is an American politician and the former U.S. Representative for Ohio's 18th congressional district, serving from 2007 until 2011. He is a member of the Democratic Party. 
He currently serves as a principal for Vorys Advisors LLC a subsidiary of the law firm Vorys, Sater, Seymour and Pease.\", \"refCount\": 28, \"classes\": [ { \"uri\": \"http://www.wikidata.org/entity/Q215627\", \"label\": \"http://www.wikidata.org/entity/ q215627\" }, { \"uri\": \"http://www.w3.org/2002/07/owl#Thing\", \"label\": \"owl#Thing\" }, { \"uri\": \"http://dbpedia.org/ontology/Person\", \"label\": \"person\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# agent\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#NaturalPerson\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# natural person\" }, { \"uri\": \"http://schema.org/Person\", \"label\": \"person\" }, { \"uri\": \"http://www.wikidata.org/entity/Q5\", \"label\": \"http://www.wikidata.org/entity/ q5\" }, { \"uri\": \"http://xmlns.com/foaf/0.1/Person\", \"label\": \"http://xmlns.com/foaf/0.1/ person\" }, { \"uri\": \"http://dbpedia.org/ontology/Agent\", \"label\": \"agent\" } ], \"categories\": [ { \"uri\": \"http://dbpedia.org/resource/Category:1961_births\", \"label\": \"1961 births\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Members_of_the_United_States_House_of_Representatives_from_Ohio \", \"label\": \"Members of the United States House of Representatives from Ohio\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Ohio_Democrats\", \"label\": \"Ohio Democrats\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Living_people\", \"label\": \"Living people\" }, { \"uri\": \" http://dbpedia.org/resource/Category:People_from_Dover,_Ohio\", \"label\": \"People from Dover, Ohio\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Moritz_College_of_Law_alumni\", \"label\": \"Moritz College of Law alumni\" }, { \"uri\": \" http://dbpedia.org/resource/Category:American_people_of_Greek_descent\", \"label\": \"American people of Greek descent\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Kenyon_College_alumni \", \"label\": \"Kenyon College alumni\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Democratic_Party_members_of_the_United_States_House_of_Representatives \", \"label\": \"Democratic Party members of the United States House of Representatives\" } ], \"templates\": [], \"redirects\": [] }, { \"uri\": \"http://dbpedia.org/resource/Parker_Space\", \"label\": \"Parker Space\", \"description\": \"Parker Space (born December 4, 1968) is an American Republican Party politician, who has served in the New Jersey General Assembly since 2013, where he represents the 24th Legislative District and serves on the Assembly's Agriculture and Natural Resources and Labor committees.Space is a farmer and restaurant owner who also owns Space Farms Zoo and Museum in the Beemerville section of Wantage Township in Sussex County, New Jersey.\", \"refCount\": 9, \"classes\": [ { \"uri\": \"http://dbpedia.org/ontology/Person\", \"label\": \"person\" } ], \"categories\": [ { \"uri\": \" http://dbpedia.org/resource/Category:Members_of_the_New_Jersey_General_Assembly \", \"label\": \"Members of the New Jersey General Assembly\" }, { \"uri\": \" http://dbpedia.org/resource/Category:County_freeholders_in_New_Jersey\", \"label\": \"County freeholders in New Jersey\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Living_people\", \"label\": \"Living people\" }, { \"uri\": \"http://dbpedia.org/resource/Category:1968_births\", \"label\": \"1968 births\" }, { \"uri\": 
\" http://dbpedia.org/resource/Category:New_Jersey_Republicans\", \"label\": \"New Jersey Republicans\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Mayors_of_places_in_New_Jersey\", \"label\": \"Mayors of places in New Jersey\" }, { \"uri\": \" http://dbpedia.org/resource/Category:People_from_Sussex_County,_New_Jersey\", \"label\": \"People from Sussex County, New Jersey\" } ], \"templates\": [], \"redirects\": [] }, { \"uri\": \"http://dbpedia.org/resource/Olli_Wisdom\", \"label\": \"Olli Wisdom\", \"description\": \"Olli Wisdom is a British musician, currently residing in London.Wisdom is significant for his role as the singer in the glam/gothic rock group Specimen, but since the 1990s he has recorded as a psychedelic trance artist under the name Space Tribe.Prior to forming Specimen and opening the Batcave nightclub, Wisdom was the frontman for the punk band, The Unwanted. Their most popular song was a cover version of Nancy Sinatra's \\"These Boots Are Made For Walkin'\\".\", \"refCount\": 16, \"classes\": [ { \"uri\": \"http://www.wikidata.org/entity/Q215627\", \"label\": \"http://www.wikidata.org/entity/ q215627\" }, { \"uri\": \"http://www.w3.org/2002/07/owl#Thing\", \"label\": \"owl#Thing\" }, { \"uri\": \"http://dbpedia.org/ontology/Person\", \"label\": \"person\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# agent\" }, { \"uri\": \" http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#NaturalPerson\", \"label\": \"http://www.ontologydesignpatterns.org/ont/dul/ d u l.owl# natural person\" }, { \"uri\": \"http://schema.org/Person\", \"label\": \"person\" }, { \"uri\": \"http://www.wikidata.org/entity/Q5\", \"label\": \"http://www.wikidata.org/entity/ q5\" }, { \"uri\": \"http://xmlns.com/foaf/0.1/Person\", \"label\": \"http://xmlns.com/foaf/0.1/ person\" }, { \"uri\": \"http://dbpedia.org/ontology/Agent\", \"label\": \"agent\" } ], \"categories\": [ { \"uri\": \" http://dbpedia.org/resource/Category:Psychedelic_trance_musicians\", \"label\": \"Psychedelic trance musicians\" }, { \"uri\": \" http://dbpedia.org/resource/Category:English_punk_rock_singers\", \"label\": \"English punk rock singers\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Living_people\", \"label\": \"Living people\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Year_of_birth_missing_(living_people)\", \"label\": \"Year of birth missing (living people)\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Singers_from_London\", \"label\": \"Singers from London\" }, { \"uri\": \"http://dbpedia.org/resource/Category:English_male_singers \", \"label\": \"English male singers\" }, { \"uri\": \" http://dbpedia.org/resource/Category:Place_of_birth_missing_(living_people) \", \"label\": \"Place of birth missing (living people)\" }, { \"uri\": \" http://dbpedia.org/resource/Category:English_trance_musicians\", \"label\": \"English trance musicians\" }, { \"uri\": \"http://dbpedia.org/resource/Category:Gothic_rock_musicians \", \"label\": \"Gothic rock musicians\" } ], \"templates\": [], \"redirects\": [] } ] } On Fri, Jul 22, 2016 at 5:29 PM, Kunal Jha < > wrote:" "DBpedia Archive now part of Amazon Public Data Sets" "uAll, We now have the first of many data sets from the LOD community uploaded to the Amazon's public data set hosting facility. Other data sets from the LOD cloud will follow. Thus you can now do the following: 1. 
Download DBpedia data from Amazon's hosting facility at no cost to your own data center and then build your own personal or service specific edition of DBpedia 2. Download to an EC2 AMI and build yourself using Virtuoso or any other Quad / Triple Store 2. Use the DBpedia EC2 AMI which we provide (which will produce a rendition in 1.5 hrs) I would especially like to thank our colleagues and new Linked Data supporters at both Amazon Web Services (Jeff & his wonderful team) and (Joe Kelly and his colleagues) for their assistance re. getting this very taxing process in motion. This is a major step forward for the Linked Data Web effort! Please remember that: remains the staging area for data sets. This is the focal point for our infochimps.org colleagues re. their assembly and deployment work en route to AWS hosting. Links: 1. 2. VirtEC2AMIDBpediaInstall" "DBpedia Lookup PrefixSearch - Up again" "uHi all, the service is running again since yesterday afternoon. Happy querying, Chris Von: Pablo N. Mendes [mailto: ] Gesendet: Montag, 31. März 2014 19:01 An: Neubert Joachim Cc: Volha Bryl; dbpedia-discussion Betreff: Re: [Dbpedia-discussion] DBpedia Lookup PrefixSearch still down Hi folks, Thanks for your heads up on the service being down. :-) The service is kindly maintained as a voluntary effort, for free, by the fine folks at U. Mannheim. The source code is also shared for free on Github for anybody that would like to run their own instance. Heiko Paulheim, Volha Bryl and Chris Bizer may be able to help you to get in touch with the sysadmin that keeps this up for us. While we're at it, let's make sure to thank them for keeping this service available for the community! Cheers Pablo On Mar 31, 2014 1:40 AM, \"Neubert Joachim\" < > wrote: Does somebody know where the service runs, and whom to contact for it? Cheers, Joachim" "Extracting Causality from DBPedia?" "uHi, I would like to extract causality from DBPedia For example I want to perform the following SPARQL query: select ?x ?y where {?x caused ?y} I know this is not in dbpedia but what do I need to create it? What are the missing pieces? Do I have to do some sort of NLP on the abstracts or wikipedia itself? For example will this work: Linking to Wordnet and seeing if a word has wordnet_type or is a subtype of it and then identifying the NPs before and after as cause and effect? I really dont want to do NLP as this is not my final goalWhat is the easiest way to get what causes what in the dbpedia corpus? Your input is much appreciated! Marv uGuys, I am posting again Hi, I would like to extract causality from DBPedia For example I want to perform the following SPARQL query: select ?x ?y where {?x caused ?y} I know this is not in dbpedia but what do I need to create it? What are the missing pieces? Do I have to do some sort of NLP on the abstracts or wikipedia itself? For example will this work: Linking to Wordnet and seeing if a word has wordnet_type or is a subtype of it and then identifying the NPs before and after as cause and effect? I really dont want to do NLP as this is not my final goalWhat is the easiest way to get what causes what in the dbpedia corpus? Your input is much appreciated! Marv uHi Marvin, can you give an concrete example? I don't really understand Georgi" "Fact Ranking Quiz" "u[Apologies for cross-posting. Please redistribute within your own group or among colleagues, thank you!] Dear all, We have all are experiencing the exponential increase of knowledge available on the web. 
Dealing with such an extensive amount of information poses a challenge when trying to identify what is potentially important and what not. Of course this always lies in the eye of the beholder. But, we are also interested in information that might be relevant for the majority of us, the so-called mainstream. We have developed a Fact Ranking Quiz to try to find out what fact seems to be generally important for a given resource. By using the wisdom of the crowd, we aim to develop a first ground-truth corpus that will help the scientific community to evaluate various new fact ranking strategies. Our new UI has an improved user experience, such that now you are able to understand and judge the facts more easily, having a much more transparent scoring. We are committed to gathering as much input as possible, in order to draw objective conclusions about what might be generally important (for a given concept). Therefore, your help and contributions are greatly appreciated! Our tool [1] allows you to interactively rank facts about ~500 popular entities taken from Wikipedia. You may interrupt the quiz at any time you like and continue later. The longer you play and the more feedback you provide, the more points you will earn! At every step you are shown how you rank among other players and how many points you can earn. However, in case you are not familiar with one of the presented facts, you can always vote \"I don't know\". There is no right or wrong answer. Just vote as you think it seems right for you. Our goal is to aggregate votes from all the participants and determine a general (mainstream) relevance of the presented facts. The facts are automatically generated from DBpedia triples. Therefore, please don't mind the (sometimes) odd formulation. On the other hand, also DBpedia contains wrong facts. If you come across something you know is wrong, please vote for \"nonsense\" and help us also to cleanup DBpedia a little bit. Please check out the new version of our quiz and help us by putting in your knowledge and point of view! Your inputs are highly appreciated and will greatly contribute to our scientific experiment and the creation of a first fact ranking corpus. (The registration is very simple and takes 1 min, after which you will be taken to a page where the task is explained in more detail.) Please do also spread the word. The more participants, the more valid our ground truth will be. Of course, we are planning to release all of the gathered data into the public for further (scientific) use. [1] Fact Ranking Web-Application, Thanks and best regards, Semantic Technologies Team u[Apologies for cross-posting. Please redistribute within your own group or among colleagues, thank you!] Dear all, We have all are experiencing the exponential increase of knowledge available on the web. Dealing with such an extensive amount of information poses a challenge when trying to identify what is potentially important and what not. Of course this always lies in the eye of the beholder. But, we are also interested in information that might be relevant for the majority of us, the so-called mainstream. We have developed a Fact Ranking Quiz to try to find out what fact seems to be generally important for a given resource. By using the wisdom of the crowd, we aim to develop a first ground-truth corpus that will help the scientific community to evaluate various new fact ranking strategies. 
"DBpedia Live questions" "uHi I have a couple of questions about DBpedia Live : 1. Is it still Live. Last changesets seem to be on August 18th, and there was a big gap before that. Prior to this there were numerous changesets per day ? 2. What is the project status - is it still actively maintained - (am considering using it on a fairly high profile project) 3. What version of DBpedia ontology does DBpedia Live conform too, and any plans for aligning with dbpedia 2014 ontology ? many thanks for any help Paul Wilton datalanguage uHi Paul, On Sep 12, 2014 5:31 PM, \"Paul Wilton\" < > wrote: The database got corrupted twice in a short period and OpenLink is working on a fix to track the source of the error. ATM the endpoint is up but the changesets are stalled as you noticed 2. What is the project status - is it still actively maintained - (am Yes it is, if fact we plan to re-design part of the existing architecture to make more scalable It keeps the latest version of the ontology according to the mappings wiki. This is also stalled atm. there is an open bug here that will be fixed as well: uThanks Dimitris, Do you a timescale for when the Stalling issues will be fixed,and maybe a roadmap for the re-design you might be able to share ? Is it possible to publish regular n-triples changesets to the DBpedia 2014 release ?
thanks Paul On Fri, Sep 12, 2014 at 3:43 PM, Dimitris Kontokostas < > wrote: uOn Fri, Sep 12, 2014 at 6:31 PM, Paul Wilton < > wrote: OpenLink can answer this one Is it possible to publish regular n-triples changesets to the DBpedia 2014 At the moment the Live extraction is coupled with a VOS instance for publishing changes immediately to the main Live endpoint so we cannot start publishing changesets without fixing this issue Part of our roadmap is the decoupling of changeset generation process with the triple store update (for similar cases) and adding additional services besides a SPARQL endpoint (i.e. S3 access) Best, Dimitris uOpenLink - do you have a timescale for the DBpedia Live fixes? Ideally, I would like to be able to move to DBpedia 2014 now + the new onotogy, and then apply changesets to this once Dbpedia LIve is backyou think this will be supported ? (ie changesets diffed from the 2014 dbpedia dump) thanks Paul On Mon, Sep 15, 2014 at 8:21 AM, Dimitris Kontokostas < > wrote: uOn Thu, Sep 18, 2014 at 12:58 PM, Paul Wilton < > wrote: No, this is not supported. Live has a continuous update stream while the 2014 dataset is static and based on wikipedia dumps from May 2014 uyep - but given Live is not Live anymoreyou will need to baseline changesets from somewhere once it comes back it seems DBpedia2014 dumps would be a good place to start ? or at least provide them ? As a consumer adopting DBpedia 2014, any applications built upon it (like we are doing at the BBC) will become progressively stale. Wouldn't it make sense to provide Live changesets to this going forwards ? It would certainly encourage uptake, and make for a much more useful offering, and allow consumers to build much more useful applications ? On Thu, Sep 18, 2014 at 12:00 PM, Dimitris Kontokostas < > wrote: uOn 9/18/14 7:17 AM, Paul Wilton wrote: I am investigating this matter internally. The DBpedia-Live effort has been coordination challenged for a while now. I or someone from OpenLink will get back to everyone re., this matter, from our perspective. uMany thanks for the update Patrick. If I need to setup a new dbpedia Live instance, If I were to take the DBpedia 2014 dump (from May), presumably I can then apply the changesets from the date of that dump , then process your changesets when they are back up from August until now. Can you point me at the timestamp of the first changeset I should use to apply to the DBpedia 2014 dumps ? best Paul www.datalanguage.com On Fri, Sep 19, 2014 at 4:22 PM, Patrick van Kleef < > wrote: uAll, Actually the live.dbpedia.org/sparql or the dbpedia-live.openlinksw.com/sparql endpoints already serve newer data than the DBPedia 2014 dumps, since the latter were made from an extraction of Wikipedia from May, whereas the Live stream was working up to mid August. We are currently together with Dimitris to resolve the outstanding problems and we hope to resume the service early next week. We will reset the feeder to continue the Wikipedia stream from where it left of in August or slightly earlier so everyone processing the changesets will catch up as quickly as possible. As the live feeder is also patched into the Mapping Wiki, it already has the same ontology as the DBpedia 2014 database. Patrick uOn Fri, Sep 19, 2014 at 6:32 PM, Paul Wilton < > wrote: The static dump is not compatible with DBpedia Live. I am preparing a dump from our cache db right now and will send an email with instructions on how to setup. Everything should be ready by early next week. 
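In practical terms, DBpedia Live publishes each changeset as a pair of N-Triples files, one listing removed triples and one listing added triples, and a local mirror replays them in that order. A minimal SPARQL 1.1 Update sketch of one such step; the resource, values and graph name below are placeholders rather than real changeset content, and the exact changeset folder layout should be checked against the Live download area before relying on it:

# apply the removed-triples part of a changeset first
DELETE DATA {
  GRAPH <http://dbpedia.org> {
    <http://dbpedia.org/resource/Example_Page> <http://dbpedia.org/ontology/populationTotal> '100000'^^<http://www.w3.org/2001/XMLSchema#integer> .
  }
} ;
# then apply the added-triples part
INSERT DATA {
  GRAPH <http://dbpedia.org> {
    <http://dbpedia.org/resource/Example_Page> <http://dbpedia.org/ontology/populationTotal> '101000'^^<http://www.w3.org/2001/XMLSchema#integer> .
  }
}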
Best, Dimitris" "Date comparism error" "uhi, inspired by this page [1] i tried to sparql for people who managed to die on their birthday. with this query PREFIX xsd: SELECT * WHERE { ?s ?bd FILTER(datatype(?bd)=xsd:date). ?s ?dd FILTER(datatype(?dd)=xsd:date). FILTER(MONTH(?bd)=MONTH(?dd) && DAY(?bd)=DAY(?dd)) . } LIMIT 10 i get this error [2] at dbpedia's sparqlEndpoint, i also tried different birthDate properties that are available but the result stays the same. when i look at a resource's page it says the all these dates are of type xsd:date (example : any pointer appreciatedwkr turnguard [1] [2] Virtuoso 22007 Error DT001: Function month needs a datetime, date or time as argument 1, not an arg of type rdf (246) uHi, On 09/15/2012 08:02 PM, Jürgen Jakobitsch wrote: the following one should work: PREFIX xsd: SELECT ?s ?bd ?dd WHERE { ?s ?bd FILTER(datatype(?bd)=xsd:date). ?s ?dd FILTER(datatype(?dd)=xsd:date). FILTER (bif:substring(STR(?bd), 6, 2) = bif:substring(STR(?dd), 6, 2) && bif:substring(STR(?bd), 9, 2) = bif:substring(STR(?dd), 9, 4) ). } LIMIT 100" "questions about DBpedia ontology" "uHi, I have a question about the DBpedia's ontology: 1. looks like DBpedia has its own high level ontology, which is newly created, and has about 170 classes defined, and about 900 properties; 2. also, DBpedia also uses the SKOS vocabulary, YAGO classification system. How do these two related to the ontology above? 3. how can I see one example of the RDF statements generated by parsing one specific infobox? that will give me a lot better understanding I am not even sure my questions are making sense, but I do look forward to any help/thoughts you can provide. thanks! uOn 28 Jan 2009, at 18:46, l Yu wrote: For example, here's an infobox: It's easy to obtain the corresponding DBpedia URI, just replace the prefix: Open that in your browser. Look for everything involving the \"dbpedia- owl:xxx\" prefix. As a rule of thumb, the lines with property \"dbpedia- owl:xxx\" come from the Berlin infobox, while lines with \"is dbpedia- owl:xxx of\" come from infoboxes on other pages that link to Berlin. I'm not sure how the rdf:type statements involving dbpedia-owl:Xxx types are generated, Georgi could fill in the details here. To see the raw RDF data used to render the page, click the box in the top right corner. Note that DBpedia is refreshed only every few months, so the Wikipedia article might have been updated more recently. We are working on shortening the update cycle. Best, Richard uThanks! this is very helpful. And regarding my question 1 and 2, I am still looking for answers. Let us make it more clear, maybe that will help. So basically we have this new ontology (dbpedia-ontology.owl), which has about 170 classes and 900 properties, and the parsing of the infobox is done by (mainly) using this ontology. And to understand the exact mapping from infobox to this ontology, using tennis player Roger Federer as an example (given Aussie Open is going on), I did the following query: PREFIX (skip) SELECT ?property ?hasValue WHERE { ?property ?hasValue. } which shows me how the parse has been done for the infobox on the page for Roger Federer: all the property that has ever been used on resource dbpedia-ontology.owl ontology is used when the parsing is being conducted, yet, there are indeed other vocabularies involved: 1. YAGO mainly used as the subject of rdf:type. for example: dbpedia:Roger_Federer rdf:type dbpedia:class/yago/LivingPeople 2. 
SKOS: mainly used as follows: dbpedia:Roger_Federer skos:subject :Category:1981_births So the question is: Seems to me that dbpedia-ontology.owl is good enough to do the trick (at this point), why do we need SKOS and YAGO? I don't see much reason. Could you please shed some light on this one? Thanks, yu On Wed, Jan 28, 2009 at 6:11 PM, Richard Cyganiak < > wrote: uOn 29 Jan 2009, at 15:53, l Yu wrote: > Seems to me that dbpedia-ontology.owl is good enough to do the trick DBpedia includes different datasets obtained from Wikipedia using different extraction and data cleansing approaches. Users of DBpedia can choose which part of the datasets work best for them. The three datasets you mention all have different strengths: The SKOS stuff directly corresponds to Wikipedia categories, the YAGO stuff is much more comprehensive than dbpedia-owl while still being closer to a “true” class hierarchy than the SKOS categories, and dbpedia-owl is very high quality and contains properties, not just a class hierarchy. Choice is a good thing, and if you don't need any of these bits for your task, then just don't use them. Best, Richard uThank you very much Richard!! what you wrote does answer my question well: the machine-readable information is there in the dataset, and it is up to the agent to chose which one to use. Thanks again! On Thu, Jan 29, 2009 at 12:34 PM, Richard Cyganiak < > wrote: uHi, More questions on the DBpedia ontology. Again, using the same example: Wikipedia page for Roger Federer (so sorry to see him lost in the Aussie Open final!!!): so the extraction algorithm will parse the infobox on Roger_Federer page, and obviously, one of the generated RDF statements has to specify the fact that Roger_Federer as a resource (whose URI is given by class: My question is, how does the parser knows this? in other words, how does the parser knows the current page is about a tennis player? Could it because it can recognize the infobox template is an Athlete template? This part is very confusing too me. I will appreciate it a lot if Richard or someone could kindly provide an answer? Thanks! On Thu, Jan 29, 2009 at 1:22 PM, l Yu < > wrote: uAFAIK your guess is correct, the extractor assigns the TennisPlayer class because the Wikipedia article uses an infobox template that has been manually classified as a tennis player template. Best, Richard On 3 Feb 2009, at 02:51, l Yu wrote: uexactly, class statements for the dbpedia ontology base on manual mappings of infoboxes to classes. See Cheers, Georgi uHi, Thank you guys for answering my questions, I have learned a lot, and meanwhile, I have more questions. Back to Roger Federer, he is a resource of type , and this class is defined in DBpedia ontology. This ontology also defines a property, called whose rdfs:domain is Now, is a sub-class of , so property can be used to describe . However, if you look at the RDF file generated for Roger Federer, you find he following line: 1981-08-08 WHY IS THIS? What is this the property: ? I am really struggling with this, please help me if you have some idea! thanks! yu On Tue, Feb 3, 2009 at 9:30 AM, Richard Cyganiak < > wrote: uHi, Sorry I kept sending questions here, but I was reading the publications about DBpedia, and seems like there is a query builder which can really help a lot. but this link, which also seems to be the best link I can find, does not really take me anywhere. 
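To make the choice between the three classification schemes concrete, the sketch below (SPARQL 1.1, written against the Roger Federer example from this thread) lists whatever the endpoint holds in each scheme side by side. Note that later DBpedia releases attach category links via dcterms:subject rather than skos:subject, so the last pattern may need swapping depending on the edition being queried:

SELECT DISTINCT ?scheme ?value WHERE {
  { <http://dbpedia.org/resource/Roger_Federer> a ?value .
    FILTER ( regex(str(?value), '^http://dbpedia.org/ontology/') )
    BIND('DBpedia ontology class' AS ?scheme) }
  UNION
  { <http://dbpedia.org/resource/Roger_Federer> a ?value .
    FILTER ( regex(str(?value), '^http://dbpedia.org/class/yago/') )
    BIND('YAGO class' AS ?scheme) }
  UNION
  { <http://dbpedia.org/resource/Roger_Federer> <http://www.w3.org/2004/02/skos/core#subject> ?value .
    BIND('SKOS category link' AS ?scheme) }
}
ORDER BY ?scheme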
I was wondering if this \"DBpedia Query Builder\" is in production, or is just some experiment system that we cannot access from outside the lab? Would really appreciate it if someone can help! yu On Thu, Feb 5, 2009 at 11:21 PM, l Yu < > wrote: uOn 2/7/09 12:07 AM, l Yu wrote: Yu, Depending on what you want you can also try the Query Builder (which is a Generic SPARQL Query Builder) at: Guide: (*ignore some of the page aesthetics as revamps are in progress*). Kingsley uKingsley, Thanks! I will definitely check these links out, I took a very quick look, and they are all very helpful! DBpedia is a very interesting and impressive effort, yet one big problem is that first-time users will find it difficult to even get started, one has to understand RDF, understand SPARQL, and how does ontology fits into the whole picture and everything, it is indeed quick a learning curve there. It would be extremely nice, if more intuitive tutorials can be available, which will make efforts like DBpedia much more acceptable and more reachable to a much wider audiences. Anyway, thanks again! yu On Sat, Feb 7, 2009 at 11:24 AM, Kingsley Idehen < > wrote: uAbsolutely. We need a better website, better documentation and tutorials, better tools for first-time users. Volunteers highly welcome! uGeorgi, Glad to catch you - can you shed some light on my previous question (in this email thread) about the properties? without using properties defined in DBpedia ontology, the extractor seems to like properties defined elsewhere (see my previous email), can you say something about that? On Sat, Feb 7, 2009 at 11:38 AM, Georgi Kobilarov < > wrote: uOn 2/7/09 11:33 AM, l Yu wrote: Yu, The way to solve these problems re. the Web is for people like you to list: 1. What you want to do 2. Suggest how you would like it to be done. The challenge on the Web is that the audience is \"Open World\" based. Thus, it very hard to make generic UI entry points. What we can do though, is get feedback and then respond quickly. Certainly DBpedia needs a few updates and guides for \"first time users\", and Ideally this should be done via the DBpedia Wiki pages at: Anyway, nice feedback from you already, you've brought \"Query\", \"Search\", \"Find\", and \"Auto Complete\" back under the radar :-) Kingsley uHello, there are two different infobox extractors in DBpedia: - one parsers *all* properties of *all* infoboxes in a generic fashion, turning every infobox property into a rdf-predicate in the namespace dbpedia.org/property. - the other (mapping based) infobox extractor uses the dbpedia ontology and predicates in the namespace dbpedia.org/ontology. this data is supposed to be better structured, but only a subset of infoboxes and infobox properties is covered yet. Since we are still running both extractors and load both resulting datasets into our sparql endpoint, you'll see much \"duplicate\" data on dbpedia pages. Best, Georgi uOn 2/7/09 11:40 AM, l Yu wrote: Yu, There are a number of ontologies (Yago & OpenCyc in addition to DBpedia's) associated with DBpedia, you even have an ontology bridge in the form of UMBEL. Thus, services that provide \"Search\", \"Find\", \"Query\", \"Auto Complete\" for DBpedia, from different service providers (which is great), are going to vary. For instance, our services (as per faceted browser and service) include all the ontologies rather that one (so \"properties\" or \"properties by value\" leverage all of the ontologies). 
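The two extractor namespaces described above are easy to see in a single query: the sketch below pulls every raw-infobox statement and every mapped statement for the Berlin example mentioned earlier in the thread, so the overlap between the generic data and the cleaner mapping-based data is visible directly. Any resource with a mapped infobox would work equally well:

SELECT ?p ?o WHERE {
  <http://dbpedia.org/resource/Berlin> ?p ?o .
  FILTER ( regex(str(?p), '^http://dbpedia.org/(property|ontology)/') )
}
ORDER BY ?p
LIMIT 200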
btw - we will add an \"Auto Complete\" feature to the Faceted Browser UI (even though we kinda assumed others would have done this based on the Web Service Interface). Kingsley uKingsley, I agree with you - the ideal place is DBpeida's wiki page, there should be information that can guide the first-time users. I am not saying it is not good, but regarding the query build I was asking, you can find this line in the DBpeida's wiki page: The Leipzig query builder at but if you click it, it goes to a Page Loader Error with time out information. Also, at the place where some tutorial should be collected, instead, a list of academic publications are listed, I actually read all of them, but I think it would be better if more entry-level materials are there. Anyway, thank you and others for your help, I learned a lot indeed. yu On Sat, Feb 7, 2009 at 11:46 AM, Kingsley Idehen < > wrote: uI see!! Okay, the extractor using dbpedia ontology is much better, in my opinion. the properties it uses are from a formal ontology written in OWL, and it has rdfs:domain and rdfs:range nicely defined, and I can use ontology's reasoning ability if I code up my own user agent, all sort of nice things can happen. On the other hand, the original extractor, which turns every infobox property into some property (in the namespace dbpedia.org/property/) is not very attractive to me, and nothing I can do about them. For Mr. Roger Federer (and other tennis players), one such property is call rd1team, it took me forever to realize this means \"first round\" something, but this property is not defined anywhere, and I have no idea about it domain and range When do you plan to get rid of the original extractor, so all properties from infobox will be mapped nicely to the properties defined in DBpedia ontology? Thanks!!! and sorry for keep bothering you guys! On Sat, Feb 7, 2009 at 11:50 AM, Georgi Kobilarov < > wrote: uWe need to reach nearly-full coverage of mappings between Wikipedia infoboxes and the DBpedia ontology before we can think about making that dataset the main one. The issue here is simple: Somebody has define all these mappings. We made a first shot (i.e. the data you see in the ontology right now, many thanks to Anja Jentsch for here great work here), but we're still too far away from full coverage. My idea is to ask the DBpedia community to maintain the ontology and all mappings, but I haven't had the time to implement the according user interface yet. And the UI is not going to be trivial: It has to be easy enough to browse and maintain the ontology (which will - I think -contain hundreds of classes and over thousand properties in its end version), easy to create/maintain the mappings of infoboxes, and it will need to have a user role model and versioning, so that no-one will be able to \"break\" things. No rocket-science, but also still not something to implement on a weekend unfortunately. It will come, but I can't promise when. Cheers, Georgi uThank you Georgi for the explanation - and the whole idea is quite clear to us now. I agree with you: this detailed mapping has to be created and maintained somewhat manually, and once we have a new mapping done, we can run the extractor to get better RDF graphs. More specifically, two types of mappings have to be maintained: 1. class-level mapping each \"infobox template\" has to be mapped to some class defined in DBpedia ontology; this is the *relatively* easy part; 2. 
property-level mapping this is the one that requires patience and tons of work: map each property from a given infobox template to the properties of its corresponding class defined in the first mapping. Obviously, lots of work needs to be done. Once you are done with the GUI, I will be happy to be one of the person maintaining/creating the mapping! thanks, yu On Sat, Feb 7, 2009 at 12:21 PM, Georgi Kobilarov < > wrote: uOn 2/7/09 11:56 AM, l Yu wrote: Yes, and that will be fixed. There is a new version in development. That said, the page should have been fixed :-) The Wiki should be public editable, so you can make some changes, I believe :-) Kingsley" "What's going wrong with the Openlink LOD SPARQL endopoint?" "uHi all, try several times this simple query on (default dataset name: SELECT COUNT(DISTINCT ?movie) WHERE { ?movie a dbpedia-owl:Film. ?movie rdfs:label ?label. FILTER ( lang(?label) = \"en\" ) } You will get each time a different number. :-) Now, if you try the same query on the \"correct\" number (at least it does not change!): 60222 If you remove the filter condition, you will get the same result. ps: Kingsley, I know the answer is yours! :-) cheers, roberto uHi Roberto, Did you notice that on field Execution Timeout is set to 15000 ? This enables the ANYTIME[1] query option in Virtuoso, which ensures that Virtuoso will return with whatever records it has found in 15000 msec, or about 15 seconds of processing time. Depending on other queries running, this amount will be different between successive runs explaining the difference you are seeing. If you retry your query, but set the field to 0 or empty, then you get similar behaviour to Note that on both machines there are (different) maximum timeouts set, to make sure everyone gets an even share of the resources available on these services. See also: [1] Patrick uIl 11/2/2011 1:43 PM, Patrick van Kleef ha scritto: Great! Thank you very much Patrick! I was sure that someone from OpenLink Software would have answered to me. :-) Just to take advantage of you, is there a way to setting this parameter for similar queries programmatically, e.g., via Java code? At the moment with Java I use Jena and the sparqlService: Query query = QueryFactory.create(\"SELECT \"); QueryExecution qexec = QueryExecutionFactory.sparqlService(\" query, \" try { ResultSet results = qexec.execSelect(); while (results.hasNext()) { } } finally { qexec.close(); }" "A New HTTP Status Code for Legally-restricted Resources" "uOn 6/12/12 11:17 AM, Melvin Carvalho wrote: Nice find! Also cc'd in LOD and DBpedia lists. From time to time, we get requests to remove entries from DBpedia and DBpedia-Live. Typically, if the page is deleted from Wikipedia it gets deleted automatically from the live editions and manually via the static edition. That said, I envisage a time when there's a deadlock re. Wikipedia content and a legal \"cease and desist\" notice sent to DBpedia" "Downtime for mappings.dbpedia.org server" "uHi, When I trying to access the links - , got the message \"Service Temporarily Unavailable The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.\" Is there a scheduled downtime for accessing the mapping server? Thanks and regards, Venkatesh Hi, When I trying to access the links - Venkatesh uOn Fri, Oct 12, 2012 at 12:44 PM, Venkatesh Channal < > wrote: Should work now. Please let us know if you encounter any problems. There is no scheduled downtime or service level target. 
We basically try to stay up as much as possible. :-) We recently moved some servers, that's why some were not always reachable, but everything should be back to normal now. Regards, JC" "DBpedia page on Wikipedia" "uHi all, several people have pointed out that it is strange that DBpedia doesn't have a page on Wikipedia, which is of course true. Therefore I have started a new page at: Please feel free to add stuff that I have missed. Cheers Chris uChris Bizer wrote: Really strange :-) It's amazing the things that sometimes get overlooked :-) Kingsley" "DBpedia-based RDF dumps for Wikidata" "uDear all, TL;DR; We are working on an *experimental* Wikidata RDF export based on DBpedia and would like some feedback on our future directions. Disclaimer: this work is not related or affiliated with the official Wikidata RDF dumps. Our current approach is to use Wikidata like all other Wikipedia editions and apply our extractors to each Wikidata page (item). This approach generates triples in the DBpedia domain ( duplication, since Wikidata already provides RDF, we made some different design choices and map wikidata data directly into the DBpedia ontology. sample data: experimental dump: (errors see below) *Wikidata mapping details* In the same way we use mappings.dbpedia.org to define mappings from Wikipedia templates to the DBpedia ontology, we define transformation mappings from Wikidata properties to RDF triples in the DBpedia ontology. At the moment we provide two types of Wikidata property mappings: a) through the mappings wiki in the form of equivalent classes or properties e.g. property: Class: which will result in the following triples: wd:Qx a dbo:Person wd:Qx dbo:birthDate “” b) transformation mappings that are (for now) defined in a json file [1]. At the moment we provide the following mappings options: - Predefined values - \"P625\": {\"rdf:type\":\" will result in: wd:Qx a geo:SpatialThing - Value formatting with a string containing $1 - \"P214\": {\"owl:sameAs\": \" will result in: wd:Qx owl:sameAs - Value formatting with predefined functions. The following are supported for now - $getDBpediaClass: returns the equivalent DBpedia class for a Wikidata item (using the mappings wiki) - $getLatitude, $getLongitude & $getGeoRss: geo-related functions Also note that we can define multiple mappings per property to get the Wikidata data closer to the DBpedia RDF exports e.g.: \"P625\": [ {\"rdf:type\":\"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing\"}, {\"geo:lat\":\"$getLatitude\"}, {\"geo:long\": \"$getLongitude\"}, {\"georss:point\":\"$getGeoRss\"}], \"P18\": [ {\"thumbnail\":\" http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300\"}, {\"foaf:depiction\":\"http://commons.wikimedia.org/wiki/Special:FilePath/$1\"}], *Qualifiers & reification* Like Wikidata we provide a simplified dump without qualifiers and a reified dump with qualifiers. However, for the reification we chose simple RDF reification in order to reuse the DBpedia ontology as much as possible. The reified dumps are also mapped using the same configuration. *Labels, descriptions, aliases and interwiki links* We additionally defined extractors to get data other than statements. For textual data we split the dumps to the languages that are enabled in the mappings wiki and all the rest. We extract aliases, labels, descriptions, site links. For interwiki links we provide links between Wikidata and DBpedia as well as links between different DBpedia language editions. 
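A hedged sketch of what querying the reified, qualifier-carrying dump described above could look like, assuming plain rdf:Statement reification with qualifier values attached directly to the statement node; the announcement does not spell out the exact layout, so treat the pattern as illustrative rather than definitive:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?p ?o ?qualifierProperty ?qualifierValue WHERE {
  ?statement a rdf:Statement ;
             rdf:subject ?s ;
             rdf:predicate ?p ;
             rdf:object ?o ;
             ?qualifierProperty ?qualifierValue .
  # keep the qualifiers, drop the reification scaffolding itself
  FILTER ( ?qualifierProperty NOT IN (rdf:type, rdf:subject, rdf:predicate, rdf:object) )
}
LIMIT 100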
*Properties* We also fully extract wikidata property pages. However, for now we don’t apply any mappings to wikidata properties. *DBpedia extractors* Some existing DBpedia extractors also apply in Wikidata that provide versioning and provenance (e.g. pageID, revisionID, etc) *Help & Feedback* Although this is a work in progress we wanted to announce it early and get you feedback on the following: - Are we going in the right direction? - Did we overlook something or is something missing? - Are there any other mapping options we should include? - Where should we host the advanced json mappings? - One option is in the mappings wiki, another one is in Wikidata directly or a separate github project It would be great if you could help us map more data. The easiest way is through the mappings wiki where you can define equivalent classes & properties. See what is missing here: http://mappings.dbpedia.org/server/ontology/wikidata/missing/ You can also provide json configuration but until the code is merged it will not be easy with PRs. Until the code is merged in the main DBpedia repo you can check it out from here: https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits Notes: - we use the Wikidata-Toolkit for reading the json structure which is a great project btw - The full dump we provide is not complete due to a Wikidata dump export bug. The compressed files are not closed correctly due to this. Best, Ali Ismayilov, Dimitris Kontokostas, Sören Auer [1] https://github.com/alismayilov/extraction-framework/blob/wikidataAllCommits/dump/config.json uDimitris, Soren, and DBpedia team, That sounds like an interesting project, but I got lost between the statement of intent, below, and the practical consequences: On Tue, Mar 10, 2015 at 5:05 PM, Dimitris Kontokostas < > wrote: What, from your point of view, is the practical consequence of these different design choices? How do the end results manifest themselves to the consumers? Tom Dimitris, Soren, and DBpedia team, That sounds like an interesting project, but I got lost between the statement of intent, below, and the practical consequences: On Tue, Mar 10, 2015 at 5:05 PM, Dimitris Kontokostas < > wrote: we made some different design choices and map wikidata data directly into the DBpedia ontology. What, from your point of view, is the practical consequence of these different design choices?  How do the end results manifest themselves to the consumers? Tom uDear Tom, let me try to answer this question in a more general way. In the future, we honestly consider to map all data on the web to the DBpedia ontology (extending it where it makes sense). We hope that this will enable you to query many data sets on the Web using the same queries. As a convenience measure, we will get a huge download server that provides all data from a single point in consistent formats and consistent metadata, classified by the DBpedia Ontology. Wikidata is just one example, there is also commons, Wiktionary (hopefully via DBnary), data from companies, DBpedia members and EU projects. all the best, Sebastian On 11.03.2015 06:11, Tom Morris wrote: uThis is a very ambitious, but commendable, goal. To map all data on the web to the DBpedia ontology is a huge undertaking that will take many years of effort. However, if it can be accomplished the potential payoff is also huge and could result in the realization of a true Semantic Web. 
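One concrete reading of the practical-consequence question: because the Wikidata statements are rewritten into DBpedia ontology terms (dbo:Person, dbo:birthDate and so on, as in the mapping examples above), a query written once against those terms should in principle run unchanged over either dataset, with only the endpoint changing. A minimal sketch:

SELECT (COUNT(?person) AS ?personsWithBirthDate) WHERE {
  ?person a <http://dbpedia.org/ontology/Person> ;
          <http://dbpedia.org/ontology/birthDate> ?date .
}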
Just as with any very large and complex software development effort, there needs to be a structured approach to achieving the desired results. That structured approach probably involves a clear requirements analysis and resulting requirements documentation. It also requires a design document and an implementation document, as well as risk assessment and risk mitigation. While there is no bigger believer in the \"build a little, test a little\" rapid prototyping approach to development, I don't think that is appropriate for a project of this size and complexity. Also, the size and complexity also suggest the final product will likely be beyond the scope of any individual to fully comprehend the overall ontological structure. Therefore, a reasonable approach might be to break the effort into smaller, comprehensible segments. Since this is a large ontology development effort, segmenting the ontology into domains of interest and creating working groups to focus on each domain might be a workable approach. There would also need to be a working group that focus on the top levels of the ontology and monitors the domain working groups to ensure overall compatibility and reduce the likelihood of duplicate or overlapping concepts in the upper levels of the ontology and treats universal concepts such as space and time consistently. There also needs to be a clear, and hopefully simple, approach to mapping data on the web to the DBpedia ontology that will accommodate both large data developers and web site developers. It would be wonderful to see the worldwide web community get behind such an initiative and make rapid progress in realizing this commendable goal. However, just as special interests defeated the goal of having a universal software development approach (Ada), I fear the same sorts of special interests will likely result in a continuation of the current myriad development efforts. I understand the \"one size doesn't fit all\" arguments, but I also think \"one size could fit a whole lot\" could be the case here. Respectfully, John Flynn From: Sebastian Hellmann [mailto: ] Sent: Wednesday, March 11, 2015 3:12 AM To: Tom Morris; Dimitris Kontokostas Cc: Wikidata Discussion List; dbpedia-ontology; ; DBpedia-Developers Subject: Re: [Dbpedia-discussion] [Dbpedia-developers] DBpedia-based RDF dumps for Wikidata Dear Tom, let me try to answer this question in a more general way. In the future, we honestly consider to map all data on the web to the DBpedia ontology (extending it where it makes sense). We hope that this will enable you to query many data sets on the Web using the same queries. As a convenience measure, we will get a huge download server that provides all data from a single point in consistent formats and consistent metadata, classified by the DBpedia Ontology. Wikidata is just one example, there is also commons, Wiktionary (hopefully via DBnary), data from companies, DBpedia members and EU projects. all the best, Sebastian On 11.03.2015 06:11, Tom Morris wrote: Dimitris, Soren, and DBpedia team, That sounds like an interesting project, but I got lost between the statement of intent, below, and the practical consequences: On Tue, Mar 10, 2015 at 5:05 PM, Dimitris Kontokostas < > wrote: we made some different design choices and map wikidata data directly into the DBpedia ontology. What, from your point of view, is the practical consequence of these different design choices? How do the end results manifest themselves to the consumers? 
Tom uSebastian, Thanks very much for the explanation. It was a single missing word, \"ontology,\" which led me astray. If the opening sentence had said \"based on the DBpedia ontology,\" I probably would have figured it out. Your amplification of the underlying motivation helps me better understand what's driving this though. I guess I had naively abandoned critical thinking and assumed DBpedia was dead now that we had WikiData without thinking about how the two could evolve / compete / cooperate / thrive. Good luck! Best regards, Tom On Wed, Mar 11, 2015 at 4:29 PM, Sebastian Hellmann < > wrote: uDimitris Kontokostas> we made some different design choices and map wikidata data directly into the DBpedia ontology. I’m very interested in this. A simple example: bgwiki started keeping Place Hierarchy in Wikidata because it’s much less efficient to keep it in deeply nested subtemplates. This made it very hard for bgdbpedia to extract this info, because how do you mix e.g. dbo:partOf and wd:Pnnn? So this is a logical continuation of the first step, which was for DBpedia to source inter-language links (owl:sameAs) from WD. (I haven’t tracked the list in a while, could someone give me a link to such dump? Sorry) Tom Morris> abandoned critical thinking and assumed DBpedia was dead now that we had WikiData That’s quite false. Both have their strengths and weaknesses. - DBpedia has much more info than Wikidata. For Chrissake, Wikidata doesn’t even have category>article assignments! - Wikidata has more entities (true, \"stubs\"), a lot of them created for coreferencing (authority control) purposes. IMHO there’s a bit of a revolution in this domain, check out VIAF is moving to Wikidata coreferencing, which will get them double the name forms, 300k orgs and 700k persons. This is Big Deal to any library of museum hack. - Wikidata has easier access to labels. In DBpedia you have to do a wikiPageRedirects dance, and if you’re naïve you’ll assume “God Does Not Play Dice” is another name for Einstein - Right now IMHO Wikidata has better direct types for persons. This is a shame and we need to fix it in DBpedia We exactly have to think about this. Last few months I've been worrying that the two communities don't much talk to each other. We as humanity should leverage the strengths of both, to gain maximum benefits. I've become active in both communities, and I feel no shame in such split loyalies. :-) I went to DBpedia Dublin, now'll go to GlamWiki Hague It's structured data each way! Some little incoherencies: - The DBpedia Extraction framework is a very ingenious thing, and the devs working on it are very smart. But can they compete with a thousand Wikidata bot writers? (plus Magnus who in my mind holds semi-God status) - Meanwhile, DBpedia can't muster a willing Wikipedia hack to service mappings.dbpedia.org, which is stuck in the stone age. - Wikidatians move around tons of data every day, but their understanding of RDF *as a community* is still a bit naïve. - DBpedia holds tons of structured data, but Wikidata seems to plan to source it by individual bot contributionsmaybe in full in 5 years time? - DBpedia has grokked the black magic of dealing with hand-written *multilingual* units and conversions. Of a gazillion units Last I looked, Wikidata folks shrugged this off with \"too different from our data types\" - A Little Cooperation Goes a Long Way. Cheers! 
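The wikiPageRedirects dance mentioned above, spelled out: redirect pages carry the alternative surface forms, so harvesting the labels of pages that redirect to a resource yields extra names, with the caveat the author points out that not every redirect label is a genuine alias. A sketch for the Einstein example:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?altLabel WHERE {
  ?redirect <http://dbpedia.org/ontology/wikiPageRedirects> <http://dbpedia.org/resource/Albert_Einstein> ;
            rdfs:label ?altLabel .
}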
u0€ *†H†÷  €0€1 0 + uDear all, Following up on the early prototype we announced earlier [1] we are happy to announce a consolidated Wikidata RDF dump based on DBpedia. (Disclaimer: this work is not related or affiliated with the official Wikidata RDF dumps) We provide: * sample data for preview * a complete dump with over 1 Billion triples: * a SPARQL endpoint: * a Linked Data interface: Using the wikidata dump from March we were able to retrieve more that 1B triples, 8.5M typed things according to the DBpedia ontology along with 48M transitive types, 6.4M coordinates and 1.5M depictions. A complete report for this effort can be found here: The extraction code is now fully integrated in the DBpedia Information Extraction Framework. We are eagerly waiting for your feedback and your help in improving the DBpedia to Wikidata mapping coverage Best, Ali Ismayilov, Dimitris Kontokostas, Sören Auer, Jens Lehmann, Sebastian Hellmann [1] msg06936.html uOn Wed, May 20, 2015 at 10:34 AM, Gerard Meijssen < Hi Gerard, This dataset originates directly from Wikidata so it's only statements that exist in Wikidata atm. With the mapping process we map Wikidata properties & classes to the DBpedia ontology and transform the statements accordingly. For instance wdt:Q42 wkd:P31 Q5 ; wkd:P569 \"11-03-1951\"^^xsd:date. will be transformed to dbw:Q42 rdf:type dbo:Person; # plus all transitive types from the DBpedia ontology dbo:birthDate \"11-03-1951\"^^xsd:date. If your point is to include missing Wikidata statements from DBpedia, this can be a step towards towards this goal since now it is easy to compare the differences. Best, Dimitris uThank you Hugh, This is definitely an area where we need further feedback from the community. Most of these links are DBpedia language links. The majority of the DBpedia links are not dereferencable and are based on the different DBpedia language editions provided as RDF dumps only. In this release we decided to provide them all for completeness but we are very open to other suggestions. Best, Dimitris On May 26, 2015 18:20, \"Hugh Glaser\" < > wrote:" "SPARQL queries: how to optimize a search by literals?" "uHi all, I need to query DBpedia to get the genre and release date of all the songs that got picked in the BBC Radio 4 Desert Island Discs show. I'm using the Jena API, in Java, to send all the SPARQL queries to the songs and the name of the artist, I need to query by these strings. To restrict the search, I've added the constraints on the resource type. If I don't put a type constraint, the query times out. Here is one of the queries I use for getting the song URI, genre and release date: SELECT DISTINCT * WHERE { { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpedia2:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( regex(?songTitle, \"Fallen Soldier\", \"i\") ) . FILTER (LANG(?songTitle) = 'en') . } UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpedia2:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( regex(?songTitle, \"Fallen Soldier\", \"i\") ) . FILTER (LANG(?songTitle) = 'en') . } UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpedia2:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( regex(?songTitle, \"Fallen Soldier\", \"i\") ) . FILTER (LANG(?songTitle) = 'en') . 
} UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpedia2:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( regex(?songTitle, \"Fallen Soldier\", \"i\") ) . FILTER (LANG(?songTitle) = 'en') . } } LIMIT 1 I've also tried to split the query into two: first to get the song URI by title, and then get the genre and release date (which is a bit faster than this one, but still very slow; sometimes it times out as it is). My question is, is there any way to optimize this search by literals ? It takes up to 5 minutes per song to return results and I need to handle about 14,000 of them. Thanks in advance! Regards, Alina Elena Băluşescu MSc student in Advanced Computer Science department The University of Manchester Hi all, I need to query DBpedia to get the genre and release date of all the songs that got picked in the BBC Radio 4 Desert Island Discs show. I'm using the Jena API, in Java, to send all the SPARQL queries to the Manchester uHi Alina, On 07/17/2012 06:48 PM, Alina Balusescu wrote: Actually, regex is an extremely expensive operation, try to use bif:contains instead. So your query should like the following: SELECT DISTINCT * WHERE { { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpprop:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( bif:contains( ?songTitle, 'Fallen and Soldier')) . FILTER (LANG(?songTitle) = 'en') . } UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpprop:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( bif:contains( ?songTitle, 'Fallen and Soldier')). FILTER (LANG(?songTitle) = 'en') . } UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpprop:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( bif:contains( ?songTitle, 'Fallen and Soldier')) . FILTER (LANG(?songTitle) = 'en') . } UNION { ?song rdf:type . ?song rdfs:label ?songTitle . OPTIONAL {?song dbpedia-owl:genre ?genre} . OPTIONAL {?song dbpprop:genre ?genre} . OPTIONAL {?song dbpedia-owl:releaseDate ?releaseDate} . FILTER ( bif:contains( ?songTitle, 'Fallen and Soldier')) . FILTER (LANG(?songTitle) = 'en') . } } LIMIT 1 Hope it works for you now." "Extracting Date Information" "uHello Everyone, First off, I want to apologize in advance for any obvious questions I may ask. I'm new to this list and new to dbpedia (and to rdf, sparql, etc etc). I've been playing around with using the dbpedia data set to extract date information about historical events. The query that I'm running through snorql is: SELECT ?x ?y WHERE { ?x dbpedia2:date ?y FILTER regex(str(?x), \"Battle\") } LIMIT 50 This returns results quickly, but the vast majority of the date information is: \"–\"@en Looking through the source on wikipedia it seems that many dates are structured like: date=[[September 12]] – [[September 15]], [[1814]] which creates hyperlinked date information. I'm guessing that the dbpedia extraction algorithm ignores the hyperlinked data and just pulls out the –? My question is: can this data be pulled out in any other way or extracted in full the next time the dbpedia data set is updated? uHi Steven, And welcome to the list :) You've spotted a bug, or kind of a bug. Our infobox extraction algorithm supports the extraction of dates (see e.g. 
the \"date\" property in doesn't seem to) support time spans (e.g. in Battle_of_Baltimore). my question @all: How to best represent timespans in RDF? Can't offer you an immediate solution, sorry, but when someone fixes this bug, the next dataset should have your data :) Please feel free to submit a bugfix yourself. Our code is open source, available at Cheers, Georgi uHello! I guess you can use the ontology at a use-case, they are equivalent) - it'd look like: :t a tl:Interval; tl:start \"1977-07-07^^xsd:date; tl:end \"1983-08-05\"^^xsd:date. Cheers! y" "Chinese mapping" "uDear all I found that there is no Chinese mapping on the mappings wiki [1] so far and tried to edit a mapping. However, I cannot create a mapping page. Shall I get the authority to edit? Can anyone help? [1] Cheers, Dongpo Dear all I found that there is no Chinese mapping on the mappings wiki [1] so far and tried to edit a mapping. However, I cannot create a mapping page. Shall I get the authority to edit? Can anyone help? [1] Dongpo uHi Dongpo, Did you try this namespace? Dimitris On Thu, Mar 20, 2014 at 4:13 PM, Dongpo Deng < > wrote: uHi Dimitris I think I used namespace, for example, I'd like to map mountain. I tried to create the page However, I cannot see where I can create the page. Is there any thing wrong? Dongpo On Mon, Mar 24, 2014 at 7:56 PM, Dimitris Kontokostas < >wrote: uHi Dongpo, When you are authenticated as user you have the option to create the page. Note that you need to login first and have editor rights for your account. If you don't have editor rights yet then please let me know your username. Roland On 24-03-14 16:28, Dongpo Deng wrote: uHi Roland Yes, I'm requesting editor account. My account is \"Dongpo\". Thank you for help! Dongpo On Tue, Mar 25, 2014 at 3:07 AM, Roland Cornelissen < >wrote: uHi, Roland Many thanks for the help. I can edit now. :) Dongpo On Tue, Mar 25, 2014 at 5:33 PM, Roland Cornelissen <" "GSoC 2016 DBpedia and Topic Modelling project announcement" "uDear DBpedia community, I'm Wojtek and hereby I would like to introduce myself as this year's Google Summer of Code student. My contribution to DBpedia will be mining topic models and extending DBpedia Spotlight's functionality by predicting the topics of the annotated document. Attached you can find the abstract of my proposal. Best regards, Wojtek Lukasiewicz *Abstract*: DBpedia, a crowd- and open-sourced community project extracting the content from Wikipedia, stores this information in a huge RDF graph. DBpedia Spotlight is a tool which delivers the DBpedia resources that are being mentioned in the document. Using DBpedia Spotlight to extract and disambiguate Named Entities from Wikipedia articles and then applying a topic modelling algorithm (e.g. LDA) with URIs of DBpedia resources as features would result in a model, which is capable of describing the documents with the proportions of the topics covering them. But because the topics are also represented by DBpedia URIs, this approach could result in a novel RDF hierarchy and ontology with insights for further analysis of the emerged subgraphs. The direct implication and first application scenario for this project would be utilizing the inference engine in DBpedia Spotlight, as an additional step after the document has been annotated and predicting its topic coverage. Dear DBpedia community, I'm Wojtek and hereby I would like to introduce myself as this year's Google Summer of Code student. 
My contribution to DBpedia will be mining topic models and extending DBpedia Spotlight's functionality by predicting the topics of the annotated document. Attached you can find the abstract of my proposal. Best regards, Wojtek Lukasiewicz Abstract : DBpedia, a crowd- and open-sourced community project extracting the content from Wikipedia, stores this information in a huge RDF graph. DBpedia Spotlight is a tool which delivers the DBpedia resources that are being mentioned in the document. Using DBpedia Spotlight to extract and disambiguate Named Entities from Wikipedia articles and then applying a topic modelling algorithm (e.g. LDA) with URIs of DBpedia resources as features would result in a model, which is capable of describing the documents with the proportions of the topics covering them. But because the topics are also represented by DBpedia URIs, this approach could result in a novel RDF hierarchy and ontology with insights for further analysis of the emerged subgraphs. The direct implication and first application scenario for this project would be utilizing the inference engine in DBpedia Spotlight, as an additional step after the document has been annotated and predicting its topic coverage. uHi Wojtek, Welcome to the DBpedia Discussion mailing list. For the people interested in your project I will mention following resources: The github repo is accessible under [1] The weekly reports and documentation will be available in the wiki [2] [1] [2] Cheers, Alexandru On Mon, May 9, 2016 at 4:50 PM, Wojtek Lukasiewicz < > wrote: uHey Wojtek, I strongly encourage you to reuse the fact extractor as much as you can if it is relevant to your work: Cheers, On 5/10/16 15:10, Alexandru Todor wrote: uHi Wojtek, Feel free to poke me if you need any help regarding spotlight Cheers, David On Wed, May 11, 2016 at 10:51 AM, Marco Fossati < > wrote:" "Subclass instances not included in SPARQL result, shouldn't they be?" "uGreetings, I noticed a discrepancy between my expectations and the results I get from a few SPARQL queries on DBPedia: Querying DBPedia for a listing of all television stations returns no results: SELECT ?x WHERE { { ?x rdf:type } } However if I specify to include instances of subclasses one inheritance level down, results do appear: SELECT ?station ?subclass WHERE { ?station rdf:type ?subclass. ?subclass rdfs:subClassOf } LIMIT 100 I expected the first query to return the results of the second (with additionally all the transitive subclasses), and I just wanted to ask on this list, is this an incorrect expectation? Shouldn't the query engine automatically infer that instances of subclasses are also instances of superclasses and include those in the result? I thought this kind of useful inferencing was one thing that gave SPARQL it's shine! I have pointed out this issue in this blog post . Thanks for your time! Best regards, Curran Kelleher Greetings, I noticed a discrepancy between my expectations and the results I get from a few SPARQL queries on DBPedia: Querying DBPedia for a listing of all television stations returns no results: SELECT ?x WHERE { { ?x rdf:type < } However if I specify to include instances of subclasses one inheritance level down, results do appear: SELECT ?station ?subclass WHERE { ?station rdf:type ?subclass. ?subclass rdfs:subClassOf < } LIMIT 100 I expected the first query to return the results of the second (with additionally all the transitive subclasses), and I just wanted to ask on this list, is this an incorrect expectation? 
Shouldn't the query engine automatically infer that instances of subclasses are also instances of superclasses and include those in the result? I thought this kind of useful inferencing was one thing that gave SPARQL it's shine! I have pointed out this issue in this blog post . Thanks for your time! Best regards, Curran Kelleher uOn 5/5/11 7:12 PM, Curran Kelleher wrote: Translation: I assumed reasoning with full transitive closure is on by default. When an endpoint is exposed to the InterWeb for anyone to use, that mode is impractical and self fulfilling re. deliberate or inadvertent DOS. Thus, we make reasoning optional via inference rules and sparql pragmas. See comments above. Yes! And if its smart then even better :-) Please update your post bearing in mind my explanation and links proving my point below: Links: 1." "Information on infoboxes in DBPedia" "uHello there, (Not sure if this email worked the first time, so I send it again) I am kind of new to DBPedia so I just read the documentation provided at home as well as the article \"DBpedia - A Crystallization Point for the Web of Data\". As far as I understand, infoboxes are used to \"export\" wikipedia content to DBPedia. So I took a look at infobox here includes links to UniProt proteins, then I went to related to UniProt in there, maybe some thing like rdfs:seeAlso, but it is not. So, what kind of information from the infoboxes is actually included into DBPedia? Why links in infobox are not in DBPedia? Thansk in advanced for any help you can provide me in this regard. Cheers, Leyla Hello there, (Not sure if this email worked the first time, so I send it again) I am kind of new to DBPedia so I just read the documentation provided at home as well as the article 'DBpedia - A Crystallization Point for the Web of Data'. As far as I understand, infoboxes are used to 'export' wikipedia content to DBPedia. So I took a look at Leyla uThe infobox is rendered by the template {{PBB|geneid=8021}} (see [1]). The DBpedia extraction currently only looks at the template data provided on the page of a resource. It does not incorporate template rendering information. Since all the data you see in the HTML table is rendered by the PBB template, the extraction does not fetch these data. The only information that can be extracted here is the ID that the gene has in the Entrez Life Sciences database. A mapping for this information already exists [2] and the data will be available in the next release. Cheers, Max [1] [2] On Thu, Aug 18, 2011 at 11:47, Leyla Jael García Castro < > wrote:" "DBPedia Query" "uHi, I am using dbpedia for one of my project involving image search and Wikipedia. When I make a sparql query for Five_Point_Someone_%E2%80%93_What_not_to_do_at_IIT%21 or Five_Point_Someone_–_What_not_to_do_at_IIT%21 it returns me empty results, though the page for the query in the DBPedia exists. I tried various combinations like changing – with %E2%80%93 and ! with %21 but none of them worked. Any help will be really appreciated. Thanks and Regards Abhishek P.S : Thanks for an excellent product, it has been really useful. And sometime even more than wikipedia :) Hi, I am using dbpedia for one of my project involving image search and Wikipedia. When I make a sparql query for Five_Point_Someone_%E2%80%93_What_not_to_do_at_IIT%21 or Five_Point_Someone_–_What_not_to_do_at_IIT%21 it returns me empty results, though the page for the query in the DBPedia exists. I tried various combinations like changing – with %E2%80%93 and ! 
with %21 but none of them worked. Any help will be really appreciated. Thanks and Regards Abhishek P.S : Thanks for an excellent product, it has been really useful. And sometime even more than wikipedia :) uHi Abhishek, On 06/13/2012 07:51 PM, Abhishek Gupta wrote: I've tried the following sparql query, and it worked for me: SELECT ?o ?p { ?o ?p } Are you trying to access DBpedia via Jena? uHi Thanks for your response. I am using the web based API, so my query (for five point someone_-_what not to do at iit!) is : But, When I do the following (analogously for Chetan Bhagat) it works. Regards Abhishek On 14 June 2012 00:05, Mohamed Morsey < >wrote: u0€ *†H†÷  €0€1 0 + uHi Kingsley, Thanks for the help. Problem has been resolved. The issue was that say we have ! in our original query. In the html query we should write %2521 (the web call) which translates to %21 (in the query form) as a query, and finally ! (from query to before final result) in the final execution. So, in a way we should have the conversion twice before making any such call. ! -> %21 -> %2521 Regards Abhishek On 14 June 2012 00:41, Kingsley Idehen < > wrote:" "Issues Retrieving resource pages on dbpedia.org containing round braces ()" "uHi, Long time listener, first time poster. Was wondering if anyone may have experienced the following behaviour when attempting to fetch resource pages from dbpedia.org? If I make a request with the following characteristics the RDF document is returned as expected. - URL: - Type: GET - Accept: application/rdf+xml However when making a request with the following characteristics I receive an empty RDF document - URL: - Type: GET - Accept:application/rdf+xml The URL in the second case is referenced as a disambiguate of the first, and if I view the same URL in a browser I get the redirect to: which contains some data which I would expect should be on the resource page. Of note, this just began happening sometime during the past few days, prior to that I was able to fetch data from pages with round braces in the URL. Anyone else seeing similar behaviour? Thanks! Matthew Hi, Long time listener, first time poster. Was wondering if anyone may have experienced the following behaviour when attempting to fetch resource pages from dbpedia.org? If I make a request with the following characteristics the RDF document is returned as expected. - URL: Matthew u uHi All, Just in case anyone is interested. After further investigation I discovered that a redirect from /resource/Companion_%28manga%29 to / data/Companion_(manga).xml is taking place. The URL of the redirect, which is unencoded returns an empty RDF document. However, a request for the same encoded URL /data/Companion_%28manga%29.xml returns the populated RDF. So, it appears to me that there is some issue with the redirect and URL encoding. Here is some relevant data from the traces I've done. 
Initial Post: GET /resource/Companion_%28manga%29 HTTP/1.1 Accept-Encoding: identity Content-Length: 0 Connection: close Accept: application/rdf+xml User-Agent: Python-urllib/2.5 Host: dbpedia.org Content-Type: application/x-www-form-urlencoded HTTP/1.1 303 See Other Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: close Date: Sat, 16 May 2009 21:19:50 GMT Accept-Ranges: bytes TCN: choice Vary: negotiate,accept Content-Location: Companion_(manga).xml Content-Type: application/rdf+xml; qs=0.95 Location: Content-Length: 0 Redirect: GET /data/Companion_(manga).xml HTTP/1.1 Accept-Encoding: identity Host: dbpedia.org Connection: close Accept: application/rdf+xml User-Agent: Python-urllib/2.5 HTTP/1.1 200 OK Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: close Date: Sat, 16 May 2009 21:19:50 GMT Accept-Ranges: bytes Content-Type: application/rdf+xml; charset=UTF-8 Content-Length: 167 xml version='1.0' encoding='%SOUP-ENCODING%' and Request for /data/Companion_%28manga%29.xml GET /data/Companion_%28manga%29.xml HTTP/1.1 Accept-Encoding: identity Content-Length: 0 Connection: close Accept: application/rdf+xml User-Agent: Python-urllib/2.5 Host: dbpedia.org Content-Type: application/x-www-form-urlencoded HTTP/1.1 200 OK Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: close Date: Sat, 16 May 2009 21:21:52 GMT Accept-Ranges: bytes Content-Type: application/rdf+xml; charset=UTF-8 Content-Length: 5262 xml version='1.0' encoding='%SOUP-ENCODING%' Snipped for Brevity > Hi," "Negation in Sparql" "uHello, someone could help me with this query? SELECT distinct ?x ?o WHERE { ?x rdfs:label ?lbl . filter(bif:contains(?lbl, \"Cole\")). filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } ))) } I would like to retrieve all the entities that in their labels contain \"Cole\" and are not Person. I don't know why In many cases it matches dbpedia-owl:Person thanks, Rocco. Hello, someone could help me with this query?  SELECT distinct ?x ?o  WHERE {  ?x rdfs:label ?lbl .  filter(bif:contains(?lbl, 'Cole')).  filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } )))  } I would like to retrieve all the entities that in their labels contain 'Cole' and are not Person. I don't know why In many cases it matches dbpedia-owl:Person Rocco. uI thought that maybe this would work, but it didn't, and I'm curious why. (Nat King Cole still shows up, and saying that he's a dbpedia-owl:Person.) SELECT distinct ?x WHERE { ?x rdfs:label ?lbl . filter(bif:contains(?lbl, \"Cole\")). ?x a ?xClass . FILTER (?xClass != dbpedia-owl:Person) . } (By the way, there's no point in SELECTing ?o if nothing in the query is going to bind it.) Bob On 11/22/2010 9:24 AM, Rocco Tripodi wrote: uOn 11/22/10 10:48 AM, Bob DuCharme wrote: I'm not a SPARQL expert, but I assume it works for '?x a dbpedia-owl:Artist' and any of the other 14 things Nat_King_Cole is other than a dbpedia-owl:Person. uOh, duh Bob. I should have had it select xClass as well as a test. Bob On 11/22/2010 12:55 PM, Tim Finin wrote: uThe query works well: SELECT distinct ?x WHERE { ?x rdfs:label ?lbl . filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } ))) filter(bif:contains(?lbl, \"Cole\")) } but the SPARQL Explorer retrieves also the redirectssuch as so OPTIONAL { ?x dbpedia2:redirect : } FILTER (!bound(?w)) Unfortunately, the query returns many other Person with other Properties. I don't figure out how to express not-Person in DBpedia. 
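Regarding the question just above about expressing "not a Person": on endpoints that support SPARQL 1.1, the restriction can be written with FILTER NOT EXISTS instead of the Virtuoso-specific bif:exists sub-select. The sketch below is an illustrative alternative rather than anything taken from the thread, and the plain CONTAINS filter will be slower than bif:contains because it does not use the full-text index.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?x ?lbl WHERE {
  ?x rdfs:label ?lbl .
  FILTER (CONTAINS(STR(?lbl), "Cole"))
  # Keep only resources that are not typed as dbo:Person.
  FILTER NOT EXISTS { ?x a dbo:Person }
}
LIMIT 50
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["x"]["value"], row["lbl"]["value"])
```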
Is there a SuperClass?. The query works well: SELECT distinct ?x WHERE {  ?x rdfs:label ?lbl . filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } ))) filter(bif:contains(?lbl, 'Cole')) } but the SPARQL Explorer retrieves also the redirectssuch as SuperClass?. uHi Rocco, Do either of the following example queries using the Yago classes as input inference rules deliver the require superclass effect you are seeking ? Inference Context Yago DEFINE input:inference \" SELECT distinct ?x ?type WHERE { ?x rdfs:label ?lbl . ?x rdf:type ?type. filter(bif:contains(?lbl, \"Cole\")). filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } ))) } Inference Context Yago DEFINE input:inference \" SELECT distinct ?x ?type WHERE { ?x rdfs:label ?lbl . ?x rdf:type ?type. filter(bif:contains(?lbl, \"Cole\")). filter (!bif:exists ((select (1) where { ?x a foaf:Person } ))) } Note the following Tutorial slides also provide additional examples on the use of the Yago classes as input inference rules in queries against DBpedia: I hope this helps Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 22 Nov 2010, at 22:45, Rocco Tripodi wrote: uHi Hugh, thanks for help. The query DEFINE input:inference \" SELECT distinct ?x ?type FROM WHERE { ?x rdfs:label ?lbl . ?x rdf:type ?type. filter(bif:contains(?lbl, \"Cole\")). filter (!bif:exists ((select (1) where { ?x a dbpedia-owl:Person } ))) } works well. It includes only the but it is not a problem. Regards Rocco 2010/11/23 Hugh Williams < >" "JSON Format Errors" "uHello everyone, I'm using dbpedia online access functionality, retrieving the queries in JSON format, and I've been experiencing a couple of issues. Most often I get the error \"Transaction timed out\" or \"JSONDecodeError\" so I set up my code to wait 10 seconds before trying again, but with certain queries I have to wait a long time (like a minute) before finally getting the data, can anyone explain why this happens? Also, I've noted that certain queries simply don't seem to exist in dbpedia. For instance, there's the wikipedia page error, is this supposed to happen? Doesn't dbpedia mirror every wikipedia page (excluding recent ones, of course) ? I love dbpedia when I get it to work correctly, so I'd much appreciate if you could help me enjoy it as much as possible. Cheers, Daniel Loureiro Hello everyone, I'm using dbpedia online access functionality, retrieving the queries in JSON format, and I've been experiencing a couple of issues. Most often I get the error 'Transaction timed out' or 'JSONDecodeError' so I set up my code to wait 10 seconds before trying again, but with certain queries I have to wait a long time (like a minute) before finally getting the data, can anyone explain why this happens? Also, I've noted that certain queries simply don't seem to exist in dbpedia. For instance, there's the wikipedia page Loureiro uHi Daniel I don't represent either OpenLink/DBPedia but I suspect that the JSON is being generated on the fly when you request it so for a topic like the US which has a large number of Triples this might take a rather long time hence the time outs and/or JSON errors you've been experiencing. Are you reliant on using the JSON (i.e. consuming it immediately in a Javascript application) or are you using a general purpose RDF/Semantic Web API? If you are doing the latter then it may be easier and more efficient to just pull back the RDF/XML or Turtle representation of the data and transform it to JSON on your end. 
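As a rough illustration of that last suggestion, pulling an RDF representation and doing the JSON conversion client-side, here is a minimal rdflib sketch. The resource chosen and the shape of the output dictionary are assumptions for the example, and it relies on the endpoint serving a serialization rdflib can negotiate.

```python
import json
from collections import defaultdict

from rdflib import Graph

# rdflib negotiates an RDF format and follows the 303 redirect for the resource URI.
graph = Graph()
graph.parse("http://dbpedia.org/resource/Berlin")

# Flatten the triples into a plain {subject: {predicate: [objects]}} dictionary.
data = defaultdict(lambda: defaultdict(list))
for s, p, o in graph:
    data[str(s)][str(p)].append(str(o))

print(json.dumps(data, indent=2, ensure_ascii=False))
```

Recent rdflib releases can also serialize a graph straight to JSON-LD, but a flat dictionary like this is often all a JavaScript front end needs.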
Regards, Rob Vesse From: Daniel Loureiro [mailto: ] Sent: 30 December 2010 11:38 To: Subject: [Dbpedia-discussion] JSON Format Errors Hello everyone, I'm using dbpedia online access functionality, retrieving the queries in JSON format, and I've been experiencing a couple of issues. Most often I get the error \"Transaction timed out\" or \"JSONDecodeError\" so I set up my code to wait 10 seconds before trying again, but with certain queries I have to wait a long time (like a minute) before finally getting the data, can anyone explain why this happens? Also, I've noted that certain queries simply don't seem to exist in dbpedia. For instance, there's the wikipedia page error, is this supposed to happen? Doesn't dbpedia mirror every wikipedia page (excluding recent ones, of course) ? I love dbpedia when I get it to work correctly, so I'd much appreciate if you could help me enjoy it as much as possible. Cheers, Daniel Loureiro uOn 1/5/11 5:45 AM, Rob Vesse wrote: Rob, At the time of Daniel's post the DBpedia instance had issues re. number of concurrent connections from all over the Web. Thus, irrespective of representation, the problem would have been the same i.e., Daniel: I assume this problem no longer occurs? I just retrieved Happy New Year! Kingsley" "Abstract extraction problem" "uHi all, I have a problem when I am trying to use AbstractExtractor. I have done the instructions from [1] to make local mediawiki instance. Then, I tested local mediawiki instance like this : the result : xml version='1.0' encoding='%SOUP-ENCODING%' This is a text for testing I think there are no problem with my local mediawiki instance, but some problem appear when I am running AbstractExtractor.  I replaced this line  privatevalapiUrl=\" with privatevalapiUrl=\" Example the error: Mar 04, 2013 6:24:24 PM org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1 apply WARNING: error processing page 'title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in' java.lang.Exception: Could not retrieve abstract for page: title=Daftar negara bagian di Jerman;ns=0/Main/;language:wiki=id,locale=in at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:66) at org.dbpedia.extraction.mappings.AbstractExtractor.extract(AbstractExtractor.scala:21) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.CompositeMapping$$anonfun$extract$1.apply(CompositeMapping.scala:13) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:239) at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59) at scala.collection.immutable.List.foreach(List.scala:76) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:239) at scala.collection.immutable.List.flatMap(List.scala:76) at org.dbpedia.extraction.mappings.CompositeMapping.extract(CompositeMapping.scala:13) at org.dbpedia.extraction.mappings.RootExtractor.apply(RootExtractor.scala:23) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:29) at org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1.apply(ExtractionJob.scala:25) at org.dbpedia.extraction.util.SimpleWorkers$$anonfun$apply$1$$anon$2.process(Workers.scala:23) at 
org.dbpedia.extraction.util.Workers$$anonfun$1$$anon$1.run(Workers.scala:131) Any idea how to fix this problem ? Thank you ! [1] Riko Adi Prasetya Faculty of Computer Science Universitas Indonesia uHi Riko, I updated the settings in the repository (although I don't think this is it) but can you pull and retry? If the problem persists, can you try to debug it and see where exactly in the retrievePage() function is the problem? e.g. test the generated url and see what you get Best, Dimitris On Mon, Mar 4, 2013 at 2:54 PM, riko adi prasetya < > wrote: uThis error occurs when getting the abstract from the modified MediaWiki fails three times (line 134 in AbstractExtractor). The actual error messages from these three tries are (or should be) written to the log file (line 128). These actual error messages would be very helpful. Could you send them to the list? Maybe you have to turn on logging first, I'm not sure. JC On Mon, Mar 4, 2013 at 1:54 PM, riko adi prasetya < > wrote: uHi Riko, I don't know what proxy means in this context, but I'm glad that the extraction is working for you now! Cheers, JC On Tue, Mar 5, 2013 at 4:36 PM, Riko Adi Prasetya < > wrote: uHi Riko, We had similar (proxy) problems in the past but we didn't document them anywhere. Would you mind writing how you bypassed the proxy issue?
You could make a pull request with your proxy-pom configuration (as a comment) and drop a couple of lines explaining it here: And of course, you can also add everything that you had to figure out on your own:) Thanks Dimitris On Tue, Mar 5, 2013 at 5:36 PM, Riko Adi Prasetya < >wrote: uHi Riko, This is weird. Could you please send us the whole stack trace? I don't think the extraction framework should try to access anything but localhost. Could be some kind of XML schema thing. If it is, we should probably turn it off. I still don't quite understand why you have to tell your JVM not to use a proxy for localhost. I guess the JVM picks up the proxy configuration from the operating system. Maybe you should configure the OS such that no proxy is used for localhost. Cheers, JC On Thu, Mar 7, 2013 at 11:36 AM, Riko Adi Prasetya < > wrote: uThanks. As I expected, the problem is that the XML parser tries to download a schema or DTD from www.w3.org, probably to validate the XML returned by the local MediaWiki. I'd like to look into it, but I don't know if I'll have time, so any help is welcome. This discussion may help: Cheers, JC On Mon, Mar 11, 2013 at 5:37 AM, Riko Adi Prasetya < > wrote:" "URI / IRI again" "uHi, the page missing a lot of data, so I went and checked the endpoints. Here's what I found. As I expected after our discussion last week about domains, IRIs and URIs, all chapters (except English) use the But I was a bit disappointed to see that contrary to the result of our discussion, a majority of chapters actually use URIs: - cs, en, fr, ja, ko, pl, pt use URIs - de, el, es, it, ru use IRIs Is this going to change? I'm currently preparing the datasets for the next DBpedia release and I would like them to be compatible with the data published by the individual chapters, but I'd also like us to move towards IRIs. What should I do? - extract URIs for the languages listed above and IRIs for all other languages - extract URIs for the languages listed above, additionally offer their datasets with IRIs, offer only IRIs for all other languages - extract URIs for en, additionally offer en datasets with IRIs, use only IRIs for all other languages - offer alternative datasets (URIs and IRIs) for all languages Either way, there is going to be confusion JC uHi Jona, The main branch should set the general direction. I think if you switch to \"IRIs only\" for the extraction, everyone will follow and we should also see more and more people fixing their tools. Cheers, Alexandru On 05/24/2012 01:07 AM, Jona Christopher Sahnwaldt wrote: u2012/5/24 Alexandru Todor < >: I Agree. uPortuguese will load what 3.8 generates, and generate compatible releases from that point on. On Thu, May 24, 2012 at 10:10 AM, Marco Amadori < >wrote:" "Work on DBpedia Release 3.9 starting + Mapping Sprint until June 30th" "uDear all, I want to share the good news that Christopher Sahnwaldt is starting to work on the DBpedia Release 3.9. As with the previous releases, it is planned that Christopher runs all extractors for all languages. The resulting dumps will be provided for download via the DBpedia download page ( DBpedia SPARQL endpoint. We hope to be able to publish the new release end of July or early August. As a lot of additional infobox-to-ontology mappings for various languages have already been entered by the mapping editors community into the DBpedia Mapping Wiki ( release, we hope to be able to provide clean data for even more infoboxes in even more languages with the new release. 
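Looping back to the abstract-extraction discussion above, Dimitris's suggestion to test the generated URL directly can be done outside the framework with a short script. This is only a diagnostic sketch: the localhost api.php address and its parameters are assumptions (the request AbstractExtractor actually builds may differ), and an empty ProxyHandler is passed so the call really goes to localhost, mirroring the "no proxy for localhost" workaround discussed in that thread.

```python
import urllib.request

# Hypothetical local MediaWiki endpoint used for abstract extraction.
api_url = ("http://localhost/mediawiki/api.php"
           "?format=xml&action=parse&page=Daftar_negara_bagian_di_Jerman")

# An empty ProxyHandler disables any system-wide proxy for this request.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

with opener.open(api_url, timeout=30) as response:
    print("HTTP status:", response.status)
    body = response.read().decode("utf-8", errors="replace")

print(len(body), "bytes returned")
print(body[:300])
```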
With this email, I would also like to ask all members of the mapping editors community for their help with the upcoming DBpedia 3.9 release in the form of a Mapping Spring: 1. Could you please check whether the mappings that you have entered into the wiki over the last year still work correctly? 2. If you still want to refine and extend mappings, now would be the perfect time to do so. 3. In order to help increasing the infobox coverage of the new release, it would also be great if you map additional templates or additional properties of existing templates to the ontology. For helping you see which widely used templates still require additional property mappings, Christopher has updated the Mapping Wiki statistics for all languages: For the English Wikipedia edition, we can see for instance at that we can still get much better with infoboxes like \"Infobox ship characteristics\" or \"Infobox football club\" or \"Chembox\" amongst a lot of others :-) The statistics for German, Spanish and French are found below and also still show lots of gaps for these important languages: Christopher will use all mappings that are entered into the Mapping Wiki until June 30th for the new release. It would thus be great if as many editors as possible participate in the mapping sprint and we as a community try to increase the mapping coverage as far as possible until this date. Lots of thanks already in advance to all mapping editors who participate. Let's try to make the DBpedia 3.9 Release even better than the 3.8 Release! Cheers, Chris uThat's really cool! Thanks for letting us know. Would it be possible to pre-populate template mapping page on the wiki when hitting \"create\" with a skeleton containing current template properties? That would ease a bit the mapping effort IMHO. What do you think? Cheers Andrea 2013/6/13 Christian Bizer < >" "Subject: SPARQL and DBPedia - getting the base url from a wiki page redirect name" "uHi folks , My problem is simply this when I am querying with 'Barack', I am not getting the summary of 'Barack_Obama' dbpedia page. I know what is the procedure of getting page redirect names from a base url name . But , is the other way round possible ? Can I get 'Barack_Obama' by sending a query with 'Barack' ? Please help me with such SPARQL query uHi Try this query: PREFIX rdfs: SELECT distinct * WHERE { ?iri rdfs:label ?label. ?label bif:contains \"barak\" FILTER( langMatches(lang(?label), \"en\")). } On Sun, Aug 12, 2012 at 8:12 AM, Somesh Jain < > wrote: ucheck this Query this should relate to you barack obama or the first related elements , i've used it in keyword search and it's tested as well query = \"select distinct ?subject ?literal ?redirects ?disamb where{\" + \"?subject ?literal.\" + \"optional { ?subject < \"optional {?subject < \"optional { ?subject < \"Filter ( !(?type=< \"?literal bif:contains '\" + bifcontains + \"'.}\" + \"limit\" + \"100\"; IN your code check if that the entity doesn't contain a ?redirects or ?disamb if it does so your entity will be the ?redirects one or the ?disamb , this is a good practice to avoid redirections or disambiguity entities On Wed, Aug 15, 2012 at 11:54 AM, Saeedeh Shekarpour < > wrote:" "Link text from Wikipedia" "uHi, DBPedia has a dataset called \"Pagelinks\" that contains all links between wiki articles. However, there is no way to find out the link text for these links.Does anybody know how I can extract this information somehow? 
For example, the wikipedia page about \"Thanksgiving\" contains a link to the wikipedia article \"Domesticated_turkey\", however the link text for that link is \"turkey\" (and not \"Domesticated turkey\"). So basically what I want to know is that on the wikipage about \"Thanksgiving\" there is a link that links with the text \"turkey\". Is this data somehow already available? If not already available, could someone suggest how I could go about to generate this dataset myself? For example, could I somewhere easily get access to a complete dump of all (English) wikipedia articles? Thanks /Omid uHi Omid, Those are usually available here : Unfortunately, the server seems down right now. The web pages are still in Google cache, but the dump is not ;-) Hope this helps, Nicolas Raoul uHi Omid, The Metaweb WEX corpus is designed specifically for these kinds of tasks" "PhD Studentship in Linked Data Integration for Social Science Applications" "uHi All a new PhD opportunity in Dublin: Closing Date:12 Noon on 4th March 2016 or until filled Post Status: 4 year PhD Studentship Department: ADAPT Centre, School of Computer Science and Statistics, Trinity College Dublin Benefits : Payment of tax - free stipend and full academic fees for EU students and partial fees for non-EU students Research Topic The Seshat Global Databank ( data on every human society throughout history and prehistory to better understand the dynamics of human social evolution. The ADAPT Centre has partnered with Seshat to provide the sophisticated Linked Data –based data curation infrastructure for semi-supervised data collection, refinement and publication. But we want to do more. There is increasing amounts of data published on the web that could be integrated into Seshat, especially structured data such as historical GIS. However there are a wide variety of information sources and standards, increasing data integration costs. This PhD will develop new methods and tools to make data integration more flexible and self-managing within the Seshat project. This post is part of the new ADAPT Centre and will work closely with the ADAPT and Seshat affiliated H2020 project ALIGNED (www.aligned-project.eu). Application Procedure Please apply via email to and include : a) Targeted cover letter (600-1000 words) expressing your suitability for a position b) Complete CV Please include the reference code E2_PhD7 on all correspondence. Further details at: rgds rob" "Missing DBpedia URIs?" "uHello!
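For the link-text question asked further above, the anchor text can be recovered from the wikitext in a Wikipedia XML dump, since piped links carry it inline. The regular expression below is a rough sketch that only handles simple [[Target|anchor]] links and ignores templates, files and nested markup; it is not the parser used by the DBpedia extraction framework.

```python
import re

# Matches [[Target]] and [[Target|anchor text]] wiki links.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")

def link_texts(page_title, wikitext):
    """Yield (source page, target page, anchor text) triples from one article."""
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip().replace(" ", "_")
        anchor = (match.group(2) or match.group(1)).strip()
        yield page_title, target, anchor

sample = "On [[Thanksgiving]] many people eat [[Domesticated turkey|turkey]]."
for source, target, anchor in link_texts("Thanksgiving", sample):
    print(source, "->", target, ":", anchor)
```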
I spotted a few missing DBpedia URIs, both on the currently live dataset and on DBpedia live, for example: exist at there is nothing there. What could explain that some Wikipedia pages are left off the extraction process? This page, in particular, is a bit odd: it has been in existence since 2010, and has a very detailed infobox. On that note, I was wondering if the DBpedia team ever considered using persistent URIs for DBpedia terms - we are using DBpedia to tag programmes at the BBC, and the fluctuation in DBpedia URIs is very hard to deal with. There are persistent identifiers accessible through the Wikipedia API that would be much more useful for keying DBpedia URIs than ever-changing URL slugs. DBpedia Lite [0] uses those and as a result has very stable URIs. Best, y [0] uI cannot reproduce. This works for me: Cheers, Pablo On Mon, Sep 5, 2011 at 3:55 PM, Yves Raimond < > wrote: uOk - I am *really* confused now - I promise this page didn't exist 10 minutes ago! (I have logs to prove it, in case it's needed) y On Mon, Sep 5, 2011 at 3:35 PM, Pablo Mendes < > wrote: uWe have a little script that automatically creates the page when someone complains that it doesn't exist. ;) Jokes apart, could it have been DNS or a temporary server outage at the hosting side? Otherwise it may be worth pinging Openlink about this one. Cheers, Pablo On Mon, Sep 5, 2011 at 4:38 PM, Yves Raimond < > wrote: We have a little script that automatically creates the page when someone complains that it doesn't exist. ;) Jokes apart, could it have been DNS or a temporary server outage at the hosting side? Otherwise it may be worth pinging Openlink about this one. Cheers, Pablo On Mon, Sep 5, 2011 at 4:38 PM, Yves Raimond < > wrote: Ok - I am *really* confused now - I promise this page didn't exist 10 minutes ago! (I have logs to prove it, in case it's needed) y On Mon, Sep 5, 2011 at 3:35 PM, Pablo Mendes < > wrote: > I cannot reproduce. > This works for me: > uIt looks like it is an encoding-related issue - so it's probably my fault! y On Mon, Sep 5, 2011 at 3:38 PM, Yves Raimond < > wrote: uIl 05/09/2011 07:42, Yves Raimond ha scritto: This is the normal behavior. It depends on the DBpedia encoding Scheme [1]. With brackets the URI is not valid, while with their 2-digits hexadecimal representation is correct. [1] cheers, roberto uIt would really help if Wikipedia (and dbpedia) redirect to the canonical URL for a page, rather than serving up a 200. In Wikipedia links, they never percent-encode brackets. This is the ruby code that I use to convert page titles to URIs: def escape_title(title) URI::escape(title.gsub(' ','_'), ' ?#%\"+=') end nick. On 05/09/2011 15:42, \"Yves Raimond\" < > wrote: nick. This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this." "SKOS property in DBPedia" "uHi Richard, It seems very strange for me to relate a DBPedia resource and its category by skos:subject. Although its rdfs:domain is not defined, I think skos:subject (and its super property dc:subject) is normally used to relate a (creative) work and its subject, not a concept and other concept, nor such real world thing as a person and a concept. 
How do you think ? 08/01/24 Richard Cyganiak< > uOn 24 Jan 2008, at 03:16, KANZAKI Masahide wrote: I agree that the most typical use of skos:subject is to relate creative works and concepts. We decided early on to model the Wikipedia category system using SKOS, and I think it's a very good match. The only problem was: What property to use for indexing DBpedia resources into their respective categories? We couldn't find any indication in the SKOS documentation that skos:subject should be used *only* for creative works. I also asked on the SKOS list if this was okay, and the consensus seemed to be that it's a bit strange, but not illegal. So we went with skos:subject. Maybe there is a better choice? Do you have a suggestion for another, more appropriate property to use in place of skos:subject? Cheers, Richard u2008/1/24, Richard Cyganiak: Well, there is no domain restriction for skos:subject, so it's 'legal' to relate anything and skos:Concept with skos:subject, but sometimes inappropriate. Let's think the following statement: dbpedia:Tim_Berners-Lee skos:subject dbpedia:Category:Living_people . I don't think it's good idea to say that \"TimBL's subject (or topic) is Living_people\" although we can say that \"TimBL is categorized as Living_people\". A person can be a subject of some works, and a person may be interested in some topics, but I can't imagine that a person has a subject or a topic I've tried similar approach that used Wikipedia as PSI of a subject, and used Wikipedia category as basis to categorize these subjects. Since I couldn't find appropriate terms for this purpose, I defined own vocabulary to describe them. can be used to relate DBPedia resource and its category, though the vocabulary is not well known (so far ;-). Or, since DBPedia already defined many terms for own project, it'd be no problem to define one more property for category relationship. best regards, uHmmm, re-reading some of the SKOS docs I get the feeling that skos:subject is indeed appropriate only for documents: | These properties [including skos:subject] can be used for subject | indexing of information resources on the web. Here 'subject indexing' | means the same as 'indexing' as defined by Willpower Glossary. The Willpower Glossary says: | indexing: intellectual analysis of the subject matter of a document | to identify the concepts represented in it, and allocation of the | corresponding preferred terms to allow the information to be retrieved So, skos:subject is intended for use on information resources, that is, documents. DBpedia resources in general are not documents. I'm logging this as a bug in the tracker. I think that a new property in the DBpedia namespace is perhaps the simplest solution, e.g. dbpedia:category. Thoughts anyone? Richard On 24 Jan 2008, at 11:06, KANZAKI Masahide wrote: uHi Richard, What is a category for DBPedia? Answering to this question will tell you if it is the good thing to do or not. If one is answering that it is a Wikipedia Category, then I will answer that it is not the good thing to do in my opinion. If one ask me why? I would answer that it is become many of the wikipedia categories are classes and in such a case why not defining them as a Class, and not a \"category\" (that could be considered a class for some sense of that class)? What if a category is not a class but something else? 
It is where the cleaning process is starting :) Take care, Fred uHi Fred 2008/1/24, Frederick Giasson : Wikipedia category is a classification, but not a Class in RDF/OWL sense (in most cases). For example, a category 'Internet_history' (taken from Tim Berners-Lee example) has 'Internet' as its super category. It's maybe OK as classification, but doesn't work as Classes, because 'Internet_history' cannot be a subClassOf 'Internet' (perhaps subClassOf 'History'). There are such categories as 'Tokyo', which could be used as labes of classification, but not even Concepts (or think it like 'Tokyo as concept' ? sounds tricky). I do support to use wikipedia concepts for useful information. Just concern how they should be related to DBPedia resources. best, uFred, On 24 Jan 2008, at 13:31, Frederick Giasson wrote: No. The Wikipedia category system is simply not appropriate as a class hierarchy. That's not a bug; it serves its purpose well, and the Wikipedia community likes it that way. It is essentially a tagging system, where tags themselves can be tagged. See [1] for an in-depth analysis. SKOS is an appropriate vocabulary for modelling the Wikipedia category system. RDFS is not. Hence, we do need a property for indexing arbitrary non-document resources into skos:Concepts. (By the way: Yes, class hierarchies can be created from the Wikipedia category system, e.g. by using cleanup heuristics and combining it with other data sources such as WordNet. Research on this is ongoing in the DBpedia project and elsewhere.) Richard [1] uHi Masahide, You just hit the nail :) You said it: in most cases. In fact, Wikipedia categories can be many things: named entities, concepts, relations, (something else?) So, why defining all wikipedia \"categories\" as \"categories\" (classification purposes) when it is *clearly* not the case? In my point of view, it just make things worse. In case of ambiguity, discard them, don't pollute the dataset with false assertions. In would encourage, everyone on that list, to read and re-read the work of Fabian on Yago: I think this would be a relation. Tricky, but it just found the gray area between a \"subject concept\" and a \"named entity\" I think. What is a \"Wikipedia concept\"? I mean, is it different from a \"concept\"? How they should be related to DBPEdia resources? With great care. Take care, Fred uHi Richard, Sure, but we are talking about the use of wikipedia categories in dbpedia; I am not questionning the use of categories in WIkipedia. RDFS is not in some case, but it is in some other. Exactly what I am talking about. And as I said in my reply to Masahide: should be handled with grat care. Many others? For sure, there are more than 50 papers that talk about that :) In multiple research groups around the globe. The only thing I say is that it is neither black or white, but there are much gray in there. And putting everything in the same bag doesn't help making it more useful. Take care, Fred uHi Richard See also the other thread about deprecation of skos:subject (I suggest to close the current thread and follow-up on that one to avoid parallel discussions) Richard Cyganiak a écrit : Indeed, but the genral notion of resource has evolved in the history of the Web from documents to more and more abstract resources. As you know, I belong to people who consider that there is a continuum from physical documents to abstract concepts, and any distinct limit between \"document\", \"information resource\", \"named entity\", \"concept\" is arbitrary. 
So, if we want to avoid endless discussions about that, let's assume that Everything is a resource Everything (including concepts themselves) can be indexed by concepts in order to be retrieved A generic mechanism for that should encompass all resources Indexing writers, musicians, buildings by art style, towns and countries by used languages or religions, restaurants by food type etc make sense whenever this is intended to retrieve those resources, not to declare classes and attributes. So, using skos:subject for grouping and retrieving DBpedia resources by Wikipedia categories makes perfect sense to me. That said, I agree the example pointed out by Masahide can seem weird. dbpedia:Tim_Berners-Lee skos:subject dbpedia:Category:Living_people but, as said in the other thread, not because of the use of skos:subject, but because of the strangeness of the Wikipedia category \"Living people\", which unfortunately tends to be not very reliable, and subject to permanent modification That said, it is perfectly functional : it supports queries retrieving all people indexed in this category. And you don't ask more to such a declaration. See above. Don't put your foot on this slippery slope > I'm logging this as a bug in the tracker. I think that a new property Well, I don't think it's a good idea. How will you federate this property with other indexing pointers? You got some :-) Cheers Bernard uOn 24 Jan 2008, at 14:09, Frederick Giasson wrote: They certainly all are skos:Concepts. SKOS was created for the purpose of representing exactly that sort of things uHi Richard, uOn 24 Jan 2008, at 15:49, Frederick Giasson wrote: No. I said that each Wikipedia category is a skos:Concept. The Wikipedia category system is a concept scheme, in the SKOS sense. There are no “named entities”, or “relations” in a concept scheme. The category \"Berlin\" is still a concept. It might have a corresponding named entity somewhere, but that's another story. The named entity is not part of the concept scheme. Richard uHi Richard, My last answer to that thread since it could evolves that way for years :) It is all about definitions and our views of these things It is what I was saying: a Wikipedia Category can be: a named entity (Elvis), a concept, a relation, and possibly other things. So it is why I was making sure that all these three things where concepts for you (because you said *all*). However it seems clear now that it is not (at my standpoint, and at yours). Totally agree that Berlin is both a named entity and a concept. It is part of the gray area between concepts and named entities. Take care, Fred uFrederick Giasson wrote:" "Extraction Framework does not work - DBpedia server down?" "uHi, I have been trying to run the extraction framework in order to update the mappingsbased_properties_de.nq file and other files for two days. 
Unfortunately the program aborts the process and instead produces some error messages: [WARNING] Could not transfer metadata org.dbpedia.extraction:core:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.internal ( transferring file: Connection refused [WARNING] Could not transfer metadata org.dbpedia.extraction:core:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.snapshots ( transferring file: Connection refused [WARNING] Could not transfer metadata org.dbpedia.extraction:core:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.internal ( transferring file: Connection refused [WARNING] Could not transfer metadata org.dbpedia.extraction:core:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.snapshots ( transferring file: Connection refused [WARNING] Could not transfer metadata org.dbpedia.extraction:main:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.internal ( transferring file: Connection refused [WARNING] Could not transfer metadata org.dbpedia.extraction:main:2.0-SNAPSHOT/maven-metadata.xml from/to maven.aksw.snapshots ( transferring file: Connection refused Does anybody have a clue about when the dbpedia server will be accessible again? Cheers, Bastian uHi Bastian, DBpedia download server is back to work a couple of days ago. There was a problem with air conditioning and it's now over." "AboutThisDay.com - a 'this day in history' style search engine powered by DBPedia" "uhi All, I am happy to announce the recent beta launch of www.AboutThisDay.com - a 'this day in history' style search engine of facts relating to people, events and holidays. This website is powered by the DBPedia and so a very massive thanks to all the developers and the sponsors behind DBPedia. Please do check this website out and please let us know of any comments/suggestions you may have to via our facebook/twitter pages. What has been released now is just an initial basic set of functionality and we have some very interesting features to come. For those who need a bit more background on this, below is a summary of the features that we believe makes our website stand out from the rest: 1. A slick, contemporary, user-friendly and highly responsive user interface 2. Date range filters to narrow down the results to specific set/range of years 3. Category filters that allows you to see politicians, sports events etc. happened on this day 4. Results sorted by popularity based on a home-grown ranking algorithm 5. Real-time updates from Wikipedia using the DBpedia live updates feature Thanks, Kavi hi All, I am happy to announce the recent beta launch of www.AboutThisDay.com - a 'this day in history' style search engine of facts relating to people, events and holidays. This website is powered by the DBPedia and so a very massive thanks to all the developers and the sponsors behind DBPedia. Please do check this website out and please let us know of any comments/suggestions you may have to via our facebook/twitter pages. What has been released now is just an initial basic set of functionality and we have some very interesting features to come. For those who need a bit more background on this, below is a summary of the features that we believe makes our website stand out from the rest: 1. A slick, contemporary, user-friendly and highly responsive user interface 2. Date range filters to narrow down the results to specific set/range of years 3. Category filters that allows you to see politicians, sports events etc. happened on this day 4. Results sorted by popularity based on a home-grown ranking algorithm 5. 
Real-time updates from Wikipedia using the DBpedia live updates feature Thanks, Kavi uHi Kavi, I can't seem to find the attribution to DBpedia (although I do see the attribution to Wikipedia). Can you please help me find it? I am also missing links from the entities (Amy Winehouse, etc.) to their DBpedia pages. On the Web, links are currency. Linking to the DBpedia identifiers is a great way to show your appreciation for the work done in this project. Cheers, Pablo On Fri, Sep 14, 2012 at 5:06 PM, AboutThisDay < > wrote: uhi Pablo, Good spot :) and many thanks for the feedback. We will very soon be adding some kind of an 'About Us' page where we were planning to properly credit/attribute DBpedia. But since you have mentioned it, we have now added attribution to DBpedia in both the preview panel (right at the bottom where you found the Wikipedia attribution) and at the footer. Thanks again, Kavi www.aboutthisday.com - an 'on this day' style search engine On Fri, Sep 14, 2012 at 5:22 PM, Pablo N. Mendes < >wrote: u0€ *†H†÷  €0€1 0 + u0€ *†H†÷  €0€1 0 + uOn 9/14/12 5:42 PM, Kingsley Idehen wrote: And I just noticed that you are exposing DBpedia URIs when an item is selected. Example: does expose (via @href) in the blurb section e.g.: From the Wikipedia article Heinz Gerischer retrieved via DBpedia , released under the CC BY-SA 3.0 license. To make it clearer just change \"DBpedia\" to \"DBpedia URI\" or \"DBpedia ID\" and you are very much set re. virtuous and beneficial attribution, Linked Data style! uhi Kingsley, Thanks very much for taking the time to review this site and for all your suggestions. selected>> To make it clearer just change \"DBpedia\" to \"DBpedia URI\" or \"DBpedia ID\" and you are very much set re. virtuous and beneficial attribution, Linked Data style! I actually read your recent email to this mailing list about the 'best practices' reg. DBPedia attribution and went with the @href style as it best suited our UI design principles and performance guidelines. I personally think that the information in your mail would be very useful to other DBpedia users if published in the DBpedia website (similar to BTW, the \"DBpedia ID\" change has been done. Thanks again, Kavi u0€ *†H†÷  €0€1 0 + uOn 9/14/12 7:20 PM, AboutThisDay wrote: One more thing, I just noticed: emulating poor URI patterns from the likes Google and Twitter. Are you able to make Web-scale permalinks for your Web document?" "Announcing: Search DBpedia.org" "uHi all, exploring the DBpedia dataset and its linked data wasn't really easy and intuitive to developers and end users. To solve this problem was my primary object when I started development of a combined search application and data browser. The first prototype is now publicly available at It features full text search, faceted browsing, result classification and much more. Being a first prototype, there are some open issues and performance limitations, but I hope you get the idea of bringing together usability and structured data. I've posted a short introduction of the project to my blog at [1] I highly appreciate feedback and ideas for further improvements. Cheers Georgi [1] uHi all, unfortunately there is an issue with our Apache web server, so you might get a 502 \"bad gateway\". To fix that I have to update Apache to latest version, which can't be done immediately. I hope to get this fixed soon. Georgi" "how to parse infobox prop like dbpedia?" "uhi guys, Recently I have been trying to do some data mining on wikipedia. 
I wish to parse infobox properties in pairs. However, I ran into a problem: when I want to write a homepage parser, I find that labels in different infoboxes mean the same concept, homepage. I got confused and referenced your project. DBpedia has a very good performance on this. It's impressive. So I wish to know how you *get the infobox templates from the wiki*? I mean the dumps of all of them. I searched the wiki and didn't get a satisfying answer. In addition, a small question: how can you map different homepage labels to one dbpedia ontology label? I checked the \"Mapping_en.xml\" file in your project. Is the mapping process done manually, or automatically? Thank you very much in advance!" "Bls: @devs: Please add Indonesian namespace" "uHi Pablo, Jona, Dimitris, and Hamza Thank you for responding to my request. I hope I can try mapping for Indonesian next week. Cheers, Riko From: Dimitris Kontokostas < > To: Hamza Asad < > Cc: dbpedia-discussion < > Sent: Friday, 1 March 2013 13:54 Subject: Re: [Dbpedia-discussion] @devs: Please add Indonesian namespace Hi Hamza, I'll try to have this ready by the beginning of next week. I'll let you know how it goes Best, Dimitris On Fri, Mar 1, 2013 at 7:45 AM, Hamza Asad < > wrote: Urdu Means Roman Urdu (Which is written in English format). Its Humble Request uHi Dimitris and All, Thank you, Now, I will start creating mappings for Indonesian.
Regards, Riko Dari: Dimitris Kontokostas < > Kepada: riko adi prasetya < > Cc: Hamza Asad < >; \" \" < >; Jona Sahnwaldt < >; dbpedia-discussion < > Dikirim: Selasa, 5 Maret 2013 14:46 Judul: Re: [Dbpedia-discussion] @devs: Please add Indonesian namespace Hi Riko, You can start creating mappings for Indonesian and Urdu :-) There are no statistics for now but we plan to add them shortly Best, Dimitris On Fri, Mar 1, 2013 at 2:04 PM, riko adi prasetya < > wrote: Hi Pablo, Jona, Dimitris, and Hamza uHi Dimitris, There are no statistics for id[1]. Can you help me to add statistics for id? I'm really need your help. [1]    Regards, Riko Dari: Dimitris Kontokostas < > Kepada: riko adi prasetya < > Cc: Hamza Asad < >; \" \" < >; Jona Sahnwaldt < >; dbpedia-discussion < > Dikirim: Selasa, 5 Maret 2013 14:46 Judul: Re: [Dbpedia-discussion] @devs: Please add Indonesian namespace Hi Riko, You can start creating mappings for Indonesian and Urdu :-) There are no statistics for now but we plan to add them shortly Best, Dimitris On Fri, Mar 1, 2013 at 2:04 PM, riko adi prasetya < > wrote: Hi Pablo, Jona, Dimitris, and Hamza" "Wiktionary DBpedia - Supported Data" "uHi, I'm quite new to the world of DBpedia and SPARQL endpoints, so please feel free to point me in the right direction if I could find the answer to those questions elsewhere (Google, StackOverflow and SemanticWeb didn't turn any result.) I've got two simple questions on Wiktionary.DBpedia. 1) ENGLISH VS. OTHER SUPPORTED LANGUAGES The Wiktionary DBpedia project page states the currently available languages as \"English, German, French, Russian\". However, the lexical entries consistently only point to dbpedia-en-* resources, here are a few examples: ie. : no dbpedia-ru-*, dbpedia-fr-* anywhere in there. Which makes it look like only the English Wiktionary pages are actually connected to the project. What am I missing here and how can I check the extent of support for each of the currently available languages? My goal is to extract data from ru.wiktionary pages, through DBpedia. 2) SPECIFICATIONS ON AVAILABLE WIKTIONARY DATA Is all of the data from a Wiktionary page supposed to be available through Wiktionary.DBpedia? For example, I can see the definition for the word машина in Russian here machine 2. engine 3. mechanism\", etc.) but not there Where can I find a list of Wiktionary sections that are officially supported by the Wiktionary DBpedia project? Also, where can I find the age of the data available on Wiktionary.DBpedia? Thank you for the help. I look forward to using and contributing to the project. Sincerely, Fabien Hi, I'm quite new to the world of DBpedia and SPARQL endpoints, so please feel free to point me in the right direction if I could find the answer to those questions elsewhere (Google, StackOverflow and SemanticWeb didn't turn any result.) I've got two simple questions on Wiktionary.DBpedia. 1) ENGLISH VS. OTHER SUPPORTED LANGUAGES The Wiktionary DBpedia project page states the currently available languages as 'English, German, French, Russian'. However, the lexical entries consistently only point to dbpedia-en-* resources, here are a few examples: Fabien uAm 13.06.2013 23:21, schrieb Fabien Snauwaert: I am not sure, what you mean. from the Russian manger page: Of course there are holes. The dump is a little bit outdated. You could rerun them here: Actually, I should update some of the documentation and do a release today. Any data or improvements you are gladly accepted. 
All the best, Sebastian uHello Sebastian, Thanks for the help. Please see my comments/questions below. On Fri, Jun 14, 2013 at 8:09 AM, Sebastian Hellmann < > wrote: My point is that this query returns results from en.wiktionary data (judging from the \"@en\" tag in the results) meanwhile I would expect them to come from ru.wiktionary instead (because of the \"@ru\" tag in the query): # EXTRACT THE PRONUNCIATION OF A RUSSIAN WORD - Mind the @ru PREFIX dc: PREFIX rdfs: PREFIX wt: SELECT DISTINCT ?spell ?pronounce WHERE { ?spell rdfs:label \"manger\"@ru; wt:hasLangUsage ?use . ?use dc:language wt:French; wt:hasPronunciation ?pronounce . } # Results: Mind the @en # spell pronounce # # # It may seem trivial on such an example, but on the below query, it is quite significant. Trying to extract the pronunciation (IPA transcription) of a word from ru.wikpedia rather than en.wikipedia (because en.wikpedia misses results that ru.wikipedia does have): # EXTRACT THE PRONUNCIATION OF A RUSSIAN WORD PREFIX dc: PREFIX rdfs: PREFIX wt: SELECT DISTINCT ?spell ?pronounce WHERE { ?spell rdfs:label \"бы\"@ru; wt:hasLangUsage ?use . ?use dc:language wt:Russian; wt:hasPronunciation ?pronounce . } # Returns an empty result # Meanwhile en.wikipedia does not have the pronunciation (IPA) for that word http://en.wiktionary.org/wiki/%D0%B1%D1%8B # While, ru.wikipedia does have it http://ru.wiktionary.org/wiki/%D0%B1%D1%8B Could you please clarify if en.wiktionary data has a special role inside of DBpedia as opposed to other languages? (or explain the above results to me) Thanks!" "local dbpedia dump load to virtuoso server" "uHi, I am loading dbpedia version 3.8 into Virtuoso 6.1. I unzipped the files into .gz format and ran ld_dir method in isql mode to load all files in a folder (whole dump). Then I am running rdf_loader_run() to process the files. I have two questions, 1. the server is having 32 GB of ram and 8 cores. But still it has taken 2 days to load half the files. Why is it taking this much time? is it normal? 2. When I check the status using SELECT * FROM DB.DBA.LOAD_LIST; one file has a start timestamp but no end timestamp. Following is the relevant row for that file in the output, /home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_properties_en.nt.gz If I re-load that file only after the other files are processed, will it add duplicate tuples or I have to remove that file and then load that file only? If so, what is the process to do that? If I commit after processing all files are finished (I assume I have to commit using command commit WORK; to make changes permanent) and only load that infobox_properties_en.nt.gz file only again, is it fine? Please let me know some details about these two problems and make the local dbpedia server working without any errors. Thank you. regards, kalpa P {margin-top:0;margin-bottom:0;} Hi, I am loading dbpedia version 3.8 into Virtuoso 6.1. I unzipped the files into .gz format and ran ld_dir method in isql mode to load all files in a folder (whole dump). Then I am running rdf_loader_run() to process the files. I have two questions, 1. the server is having 32 GB of ram and 8 cores. But still it has taken 2 days to load half the files. Why is it taking this much time? is it normal? 2. When I check the status using SELECT * FROM DB.DBA.LOAD_LIST; one file has a start timestamp but no end timestamp. Following is the relevant row for that file in the output, /home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_properties_en.nt.gz you. 
regards, kalpa u0€ *†H†÷  €0€1 0 + uHi Hugh, Yes, I uncommented the lines and commented out the default number of buffers before I started the server. I made the changes in virtuoso.ini in var/lib/db folder. But following is the status I see when I type status(''); in isql. I think it is still using 2000 buffers and I do not know why it is using the default setting after I have changed the parameters. OpenLink Virtuoso Server Version 06.01.3127-pthreads for Linux as of Nov 15 2012 Started on: 2012/11/18 16:10 GMT-300 Database Status: File size 29255270400, 3571200 pages, 760183 free. 2000 buffers, 2000 used, 1175 dirty 6 wired down, repl age 429 441 w. io 3 w/crsr. Disk Usage: 352003762 reads avg 0 msec, 0% r 0% w last 0 s, 198597618 writes, 1311 read ahead, batch = 1. Autocompact 4406099 in 3148871 out, 28% saved. Gate: 3381 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap. Log = /home/kalpa/Virtuoso/var/lib/virtuoso/db/virtuoso.trx, 2394315279 bytes 1545766 pages have been changed since last backup (in checkpoint state) Current backup timestamp: 0x0000-0x00-0x00 Last backup date: unknown Clients: 3 connects, max 2 concurrent RPC: 37 calls, 1 pending, 1 max until now, 0 queued, 0 burst reads (0%), 0 second brk=159555584 Checkpoint Remap 339966 pages, 0 mapped back. 1997 s atomic time. DB master 3571200 total 760183 free 339966 remap 44104 mapped back temp 256 total 250 free Lock Status: 0 deadlocks of which 0 2r1w, 105 waits, Currently 3 threads running 0 threads waiting 0 threads in vdb. Pending: 23 Rows. u0€ *†H†÷  €0€1 0 + uHi Hugh, There was a bug in the script (ini file when uncommenting and had to remove spaces). I think it is working fine now and loaded within few hours and I can see its memory usage using top command and isql status(''); message for number of buffers. I have a another issue though, I downloaded all the files for DBpedia 3.8 dump but when I query an instance, it doesn't give me interconnections (sameAs link) for other datasets like Freebase but only sameAs links within dbpedia. Following is the list of loaded files (which was downloaded from dump and loaded). There if the folder, I have created a file called global.graph as mentioned in the openLink website documentations but it didn't get loaded. I can also see that external interlining file is also loaded but I do not get sameAs links to Freebase. I checked this by running the same query in dbpedia.org sparql endpoint and my local endpoint. Do you have any idea what would have happened if anything went wrong in my installation? Thank you very much for the help. 
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/article_categories_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/category_labels_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/disambiguations_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/disambiguations_unredirected_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/external_links_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/geo_coordinates_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/global.graph
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/homepages_en.nt.gz
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/images_en.nt.gz http://dbpedia.org 2 2012.11.21 22:7.33 0 2012.11.21 22:12.56 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_properties_en.nt.gz http://dbpedia.org 2 2012.11.21 22:12.56 0 2012.11.21 23:10.20 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_properties_unredirected_en.nt.gz http://dbpedia.org 2 2012.11.21 23:10.20 0 2012.11.21 23:37.1 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_property_definitions_en.nt.gz http://dbpedia.org 2 2012.11.21 23:37.1 0 2012.11.21 23:37.6 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/infobox_test_en.nt.gz http://dbpedia.org 2 2012.11.21 23:37.1 0 2012.11.22 0:28.20 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/instance_types_en.nt.gz http://dbpedia.org 2 2012.11.22 0:28.20 0 2012.11.22 0:36.7 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/interlanguage_links_en.nt.gz http://dbpedia.org 2 2012.11.22 0:36.7 0 2012.11.22 0:48.43 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/interlanguage_links_same_as_chapters_en.nt.gz http://dbpedia.org 2 2012.11.22 0:48.43 0 2012.11.22 0:55.18 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/interlanguage_links_same_as_en.nt.gz http://dbpedia.org 2 2012.11.22 0:55.19 0 2012.11.22 1:8.38 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/interlanguage_links_see_also_chapters_en.nt.gz http://dbpedia.org 2 2012.11.22 1:8.39 0 2012.11.22 1:9.6 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/interlanguage_links_see_also_en.nt.gz http://dbpedia.org 2 2012.11.22 1:9.6 0 2012.11.22 1:9.37 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/iri_same_as_uri_en.nt.gz http://dbpedia.org 2 2012.11.22 1:9.37 0 2012.11.22 1:10.41 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/labels_en.nt.gz http://dbpedia.org 2 2012.11.22 1:10.41 0 2012.11.22 1:23.53 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/long_abstracts_en.nt.gz http://dbpedia.org 2 2012.11.22 1:23.53 0 2012.11.22 1:36.56 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/mappingbased_properties_en.nt.gz http://dbpedia.org 2 2012.11.22 1:36.56 0 2012.11.22 1:59.15 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/mappingbased_properties_unredirected_en.nt.gz http://dbpedia.org 2 2012.11.22 1:59.15 0 2012.11.22 2:7.58 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/page_ids_en.nt.gz http://dbpedia.org 2 2012.11.22 2:7.58 0 2012.11.22 2:23.26 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/page_links_en.nt.gz http://dbpedia.org 2 2012.11.22 2:23.26 0 2012.11.22 4:59.20 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/page_links_unredirected_en.nt.gz http://dbpedia.org 2 2012.11.22 4:59.20 0 2012.11.22 6:15.17 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/persondata_en.nt.gz http://dbpedia.org 2 2012.11.22 6:15.17 0 2012.11.22 6:28.46 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/persondata_unredirected_en.nt.gz http://dbpedia.org 2 2012.11.22 6:28.46 0 2012.11.22 6:32.14 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/pnd_en.nt.gz http://dbpedia.org 2 2012.11.22 6:32.14 0 2012.11.22 6:32.14 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/redirects_en.nt.gz http://dbpedia.org 2 2012.11.22 6:32.14 0 2012.11.22 6:40.57 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/redirects_transitive_en.nt.gz http://dbpedia.org 2 2012.11.22 6:40.58 0 2012.11.22 6:45.28 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/revision_ids_en.nt.gz http://dbpedia.org 2 2012.11.22 6:45.28 0 2012.11.22 7:10.23 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/revision_uris_en.nt.gz http://dbpedia.org 2 2012.11.22 7:10.23 0 2012.11.22 7:46.7 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/short_abstracts_en.nt.gz http://dbpedia.org 2 2012.11.22 7:46.7 0 2012.11.22 8:1.54 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/skos_categories_en.nt.gz http://dbpedia.org 2 2012.11.22 8:1.54 0 2012.11.22 8:8.2 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/specific_mappingbased_properties_en.nt.gz http://dbpedia.org 2 2012.11.22 8:8.2 0 2012.11.22 8:9.19 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/topical_concepts_en.nt.gz http://dbpedia.org 2 2012.11.22 8:9.19 0 2012.11.22 8:9.28 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/topical_concepts_unredirected_en.nt.gz http://dbpedia.org 2 2012.11.22 8:9.19 0 2012.11.22 8:9.30 0 0 NULL NULL
/home/kalpa/Virtuoso/data/datasets/dbpedia/3.8/3.8/en/wikipedia_links_en.nt.gz http://dbpedia.org 2 2012.11.22 8:9.30 0 2012.11.22 8:50.6 0 0 NULL NULL
From: Hugh Williams [ ] Sent: Thursday, November 22, 2012 6:10 AM To: Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva Cc: Subject: Re: [Dbpedia-discussion] local dbpedia dump load to virtuoso server
Hi Kalpa, Did you restart your Virtuoso instance having made these changes to enable the new settings to take effect ? Also please provide a copy of the INI file changes you have made so we can see what should be reported in the status output ? Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // http://www.openlinksw.com/ Weblog
uOn Tue, Nov 27, 2012 at 7:30 PM, Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva < > wrote: Also load the files in Regards, Christopher
uOn Wed, Nov 28, 2012 at 2:29 AM, Hugh Williams < > wrote:
uOn Wed, Nov 28, 2012 at 4:41 PM, Jona Christopher Sahnwaldt < > wrote: Preview:
uOn Wed, Nov 28, 2012 at 10:42 AM, Jona Christopher Sahnwaldt < > wrote: Note also that if you need a fresher copy (for example to use with DBpedia Live), it's trivial to generate from the weekly Freebase dump.
You can either use the Scala program: or a very light postprocessing of the output of this command: bzgrep $'/type/object/key\t/wikipedia/en\t' freebase-datadump-quadruples.tsv.bz2 | cut -f 4,1 Tom On Wed, Nov 28, 2012 at 10:42 AM, Jona Christopher Sahnwaldt < > wrote: On Wed, Nov 28, 2012 at 4:41 PM, Jona Christopher Sahnwaldt < > wrote: > On Wed, Nov 28, 2012 at 2:29 AM, Hugh Williams < hwilliams@openlinksw.com > wrote: >> Hi Christopher, >> >> I am not aware of a linkset to Freebase in the DBpedia project. > > Tom uThank you all for the help. I see that the "links" folder is outside of the "en" folder in the dump. I thought all the related files for the English dump would be in the en folder. It seems I need to load the links folder specifically to get links to other datasets. But it would be interesting to know what is inside the "external_links" and "iri_same_as_uri_en" files in the 3.8/en folder. I thought they were the correct links to other datasets. Any idea? Thank you again for the help." "Dump extract of all company information" "uDear all, What is the simplest way to get a row extract of all company entries that are currently in DBpedia? We are building a predictive company size model and we would like to use DBpedia as a training set. Are there somewhere some statistics about the coverage (average sizes, company types, ...) of company information within Wikipedia? Many thanks for your help. Kind regards Antoine uHi Antoine, On 4/14/14, 9:45 AM, Antoine Logean wrote: I assume you are referring to the international DBpedia, i.e. An intuitive approach is to run this query [1] over the SPARQL endpoint [2], page through its results with OFFSET and LIMIT, and run a describe query (e.g. [3]) on each of the results.
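A sketch of that paging approach, extended for the company-size use case: dbpedia-owl:numberOfEmployees is the obvious candidate property but it is only an assumption that it is populated for the companies of interest, which is why it sits in an OPTIONAL block; the ORDER BY keeps the OFFSET/LIMIT pages stable between requests.
SELECT ?company ?employees
WHERE {
  ?company a dbpedia-owl:Company .
  OPTIONAL { ?company dbpedia-owl:numberOfEmployees ?employees }
}
ORDER BY ?company
LIMIT 1000
OFFSET 0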
This mapping statistics page [4] may help you, it contains information on the actual usage of the Wikipedia template mapped to Company [5]. Cheers!
[1] select ?s where { ?s a dbpedia-owl:Company } limit 1000
[2] [3] describe [4] [5]
> Many thanks for your help.
uDear Antoine, The easiest way is to use RDFSlice. RDFSlice allows you to extract the relevant fragment from DBpedia dump files in a streaming fashion. You can read more about it here: rdfslice.aksw.org. best regards, Edgard On Mon, Apr 14, 2014 at 9:45 AM, Antoine Logean < >wrote: uDear Antoine, the DBpedia as Tables site should provide what you are looking for: Best, Heiko Zitat von Edgard Marx < >:" "How xmls: and dbpedia: are related ?" "uHello, I am trying to understand how " and " are related. I browse DBpedia with the SPARQL endpoints but I cannot find any relation between them (like a 'sameAs' relation, for example). Would someone be kind enough to enlighten me? (I am a beginner, so I am looking for a basic explanation.) Christophe." "How do I consistently query dbpedia for programming languages by name?" "uThere seems to be no consistent way to query for programming languages based on name. Examples:
rdfs:label "D (programming language)"@en
dbpprop:name "D programming language"
owl:sameAs freebase:"D (programming language)"
foaf:name "D programming language"
vs.
rdfs:label "C++"@en
dbpprop:name "C++"
owl:sameAs freebase:"C++"
foaf:name "C++"
Since there's no standard convention for whether "programming language", "(programming language)", "programming_language", "(programming_language", or "" is part of a name for a programming language in dbpedia, I have no idea how to consistently search by name. I'd like to create some sort of SPARQL query that returns Unless at least one of the various triples for programming languages uses a consistent naming convention, I'll have to hack it by querying first against name + " (programming_language)", and falling back to name + "(programming language", name + " programming language" when no results are found. But I'd like a much more robust method. uHi Andrew, On 10/26/2012 11:13 PM, Andrew Pennebaker wrote: You can use regex to match it. So for D programming language you can use:
SELECT * WHERE {
  ?language rdf:type dbpedia-owl:ProgrammingLanguage .
  ?language rdfs:label ?lbl .
  FILTER ( regex(?lbl, '^D+', 'i') )
}
and for C++ you can use:
SELECT * WHERE {
  ?language rdf:type dbpedia-owl:ProgrammingLanguage .
  ?language rdfs:label ?lbl .
  FILTER ( regex(?lbl, '^C\\+\\++', 'i') )
}
uHere is one way you could generate a list of programming languages Look at the bottom of and you see categories like "C Programming Language Family" and then if you look at you'll see that is a member of category by traversing this graph you can find categories that contain programming languages and programming languages. All of the category links are in DBpedia so this is straightforward to do. The great thing is you can seed this with a query that gets partial results; for instance, you can use your search for "programming language" in the name. To be fair you'll need to put some human effort into this. You'll find some categories that turn up that are wrong, and probably get some items like "Generics in Java" and "Dennis Ritchie". Still my experience is that I can create categories of 10,000 or so things (like "things in new york city that don't have coordinates" or "things related to sex and drugs") in a few hours of work.
It's helpful to sort results with a subjective importance score so at least you can see the worst outliers. (At one point I got Hillary Clinton as the top "sex" topic, for instance, because she was the victim of adultery. It's quite interesting that the perpetrator of the adultery didn't get flagged.) The graph traversal has a similar structure to Kleinberg's hubs and authorities algorithm, and there's probably some way to assign scores to the nodes that are related to the probability of a topic or category being in the set. Note also that Freebase has a programming language type, see and you could get a list of programming languages there and then map the ids back to DBpedia. uOn 27/10/12 10:13, Andrew Pennebaker wrote: [snip] Out of interest: you (at least I assume it was you) posted this exact same question to StackOverflow a couple of days ago. A possible answer was provided there as well, but you seem to not have followed up on it there. Then I notice you also posted this mail message to two separate mailing lists at the same time. Any particular reason you are using three separate channels to ask this question, rather than just picking one and giving people some time to respond? Regards, Jeen" "extraction problem" "uHi All, Greetings for the day. I want to extract infobox properties and abstracts from (pages-articles.xml.bz2). I am able to download this file using the command "/run download config=download.de.properties"; here I have configured the file download.de.properties to download only the German page-article file. Now when I am trying to extract information from it using "/run extraction extraction.de.property" it gives me the error below. In extraction.de.property I have set the dir properly, the same one I mentioned in the download.de.properties file. Please let me know what is going wrong. Is there any change that needs to be done in the pom.xml of the dump dir? [INFO] uHi All, I want to extract abstracts from the page_article dump using the DBpedia extractor. But for some of the pages there is no proper abstract: some just say they are redirecting to some other page, and some contain other non-required information. Is there any possibility to get a cleaner abstract?
After analyzing, I came across the following: if the tag has #REDIRECT|#redirect, then those are redirect pages. If anyone has other ideas, please suggest them. Thanks On Tue, Mar 5, 2013 at 11:36 AM, riko adi prasetya < >wrote: uHi Gaurav, On 03/05/2013 09:51 AM, gaurav pant wrote: The abstract extractor requires special handling to work properly. You can find details about how to get it working here [1]. [1] abstractExtraction uHi All, @Morsey" "A Ph.D. Student’s CRY for Help" "uA Ph.D. Student’s CRY for Help To: Prof. Hamid Arabnia, WORLDCOMP Coordinator, Professor of Computer Science, University of Georgia, USA Prof. Hamid Arabnia, I am a student from Africa and I am in the final stages of my Ph.D. work. I have a journal paper in ACM and a conference paper in WORLDCOMP.
As per my university policy, I am required to have at least two research papers in peer-reviewed (refereed) international conferences or journals before I submit my synopsis/dissertation. While everything was fine, recently, WORLDCOMP was declared as a bogus conference with evidences and endorsements from scientists: Now, my university officials have stopped me from submitting the dissertation stating that WORLDCOMP is not peer-reviewed due to the evidences in these websites. They said that WORLDCOMP is completely fake and that’s why you have not responded to the open challenge at om others. Now I am shocked with your silence. I am clueless on what to do as I will forfeit my student status unless I submit the dissertation soon. I am facing humiliation here. I was attracted by the keynote speakers, sponsors, tutorials, and University of Georgia name at WORLDCOMP website. Now I feel that I made a very critical mistake by submitting to WORLDCOMP. As the last resort, I am posting to this mailing list/forum (where WORLDCOMP details were published in the past I think) so that you understand the seriousness. I once again ask you to provide a detailed response at WORLDCOMP’s website and prove that it is not a fake (and also email me my paper reviews). I request you to focus on this issue than organizing the next conference (and create more victims like me). I hope you still have moral values. My elderly parents are dependent on me and I am crying for help. I openly beg your response! Respectfully, Saidi (this is my nickname and I am using it to avoid further humiliation. I know that you will easily identify me from this nickname and from my background mentioned above). To the forum/list owner: please understand my situation and publish this message and it will help other researchers to be more careful while choosing a conference. To the members: please help me getting a response from Prof. Arabnia." "license update at the Wikimedia Foundation" "uHoi, As expected the Wikimedia Foundation will change its licensing. In the past it was said that DBpedia would follow suit. It is important that it does when you assume that DBpedia needs a license in the first place. As there is little time left in which this change can be made, I urge you all to follow the WMF. Thanks, GerardM ' Hoi, As expected the Wikimedia Foundation will change its licensing. In the past it was said that DBpedia would follow suit. It is important that it does when you assume that DBpedia needs a license in the first place. As there is little time left in which this change can be made, I urge you all to follow the WMF. Thanks,       GerardM ' Result" "What version of wikipedia is the latest dbpedia (2.0) based on?" "uWhat version of wikipedia is the latest dbpedia based on - or what dates did the latest crawl take place? -Chris uChris, On 11 Oct 2007, at 20:20, Chris Welty wrote: It's based on Wikipedia database dumps downloaded in August. Richard" "How to save result of sparql query from dbpedia on a rdf file" "uHi, How can I save result of sparql query from dbpedia on a rdf file? I used dbpedia sparql endpoint () for doing it, but it has the below error : Virtuoso 00000 Error To keep saved SPARQL results, the DAV directory \"(NULL)\" should be of DAV extension type \"DynaRes\" Thanks a lot, Sareh uHi Sareh. The option you are trying to use will save the result of the sparql query in the DAV store on the server side. This would require you to have an account on this machine, which we do not provide at this time. 
To save the results you can simply execute the query and use the File | Save as option in your browser. Alternatively you can copy the URL for the executed query and use curl or wget on the command line: curl ' Note the single quotes surrounding the url to make sure your shell does not interpret the & sign to start a background process. Patrick OpenLink Software uOn 6/28/11 8:20 AM, sareh aghaei wrote: Feature isn't enabled for DBpedia endpoint. Thus, you need to make a SPARQL protocol URL and then HTTP GET it to your local filesystem via cURL or your Browser or any other HTTP user agent. Kingsley uthanks so much for your reply From: Patrick van Kleef < > To: sareh aghaei < > Cc: Sent: Tue, June 28, 2011 1:31:46 PM Subject: Re: [Dbpedia-discussion] How to save result of sparql query from dbpedia on a rdf file Hi Sareh. The option you are trying to use will save the result of the sparql query in the DAV store on the server side. This would require you to have an account on this machine, which we do not provide at this time. To save the results you can simply execute the query and use the File | Save as option in your browser. Alternatively you can copy the URL for the executed query and use curl or wget on the command line: curl ' Note the single quotes surrounding the url to make sure your shell does not interpret the & sign to start a background process. Patrick OpenLink Software" "how often is dbpedia live updated" "uHi I made a change to a mapping days ago, how soon can I expect to see this change on live? (for example I don't see the fields I added here: Regards George uHi George, sorry about the delayed reply. DBpedia is release approximately twice a year. The data that you see at comes from the current DBpedia release which is version 3.6. The next release is scheduled for August. All the changes that were made in the mappings wiki will be included. However, since DBpedia is all open data and open source, you could also get the extraction framework [1] and extract the data yourself whenever you like. Lastly, DBpedia Live is currently in development and will feature updated data within short periods of time. Cheers, Max [1] On Thu, Apr 7, 2011 at 21:05, George Hamilton < > wrote: uHi George, Actually, DBpedia-Live gets updated every second. In our new DBpedia-Live framework, we take into consideration updates caused by both Wikipedia article change, and updates caused by a mapping change. But changes caused by Wikipedia article change have higher priority, in order for our live-update stream to work smoothly and with no blocks. So it, takes only few minutes to see changes caused by Wikipedia, whether changes caused by mappings may take longer, depending on how many Wikipedia changes are waiting for processing. We have also deployed our new DBpedia-Live framework, and you can test it and send us your feedback about it, and any suggestion will be highly appreciated. You can test our SPARQL endpoint at At the moment there are some Internet connection problems at our university, but they will be solved very soon, so you may experience some inconvenience with our server those days. Later, everything will work fine with you." "Virtuoso/DBpedia VAD missing data from .rdf files" "uHi, I've noticed that the with the RDF/XML description of the resources are missing a big portion of the data displayed on the site. For example look at literals and possibly other information from the .rdf file. 
The strange thing about this bug is that it seems to be valid only for entities of type place and subclasses of it. Entities of type person or chemical elements seem to be ok, however I haven't checked all of dbpedia and all of the properties so I can't estimate how wide-spread this issue is. Kind Regards, Alexandru Todor uHi, This is something I had to work around a while ago, so I think I know the answer, but correct me if I'm wrong. The RDF XML, JSON and N3 representations return mostly triples in which the resource can be found as an object. Actually, JSON and N3 does not even contain triples where the resource is a subject, while in RDF XML in some cases it does. The problem is that unlike the NTRIPLES, ATOM and JSOD representations, these 3 contain the resource as objects as well and in some cases this leads to lots of triples but there is a limit of 2000 for the number of returned triples. Places as you said are a good example: there are so many triples containing places that it's easy for them to reach 2000 triples while persons do not have as many. If you want to obtain the triples in which the URI is a subject, I would suggest you use the NTRIPLES representation (it also contains language data for the labels unlike rdf xml, json, n3, jsod). Regards, Zoltán On 2011.10.17. 16:09, Alexandru Todor wrote: uHi Zoltán, I have no such problems with DBpedia Germany as you can see by looking at I am pretty sure it is not an issue with Virtuoso itself or the serialization format used but with the DBpedia Vad and some sort of caching mechanism they use for the .rdf files. If you execute the describe query directly you will get the entire dataset and not the truncated one from the .rdf file, for example: Kind Regards, Alexandru On 10/18/2011 09:08 AM, Zoltán Sziládi wrote: uHi Alexandru, Based on what you and I have written, I would guess that when you try to load an .rdf file from dbpedia.org, it creates a query (with an internal redirect) that has a limit of 2000 rows (probably a Virtuoso configuration). I have no other explanationlet's wait for someone official's answer. Kind regards, Zoltán On 2011.10.18. 14:47, Alexandru Todor wrote: uHi Note we are looking into this issue with the Virtuoso DBpedia VAD and will provide an update on this soon … Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 18 Oct 2011, at 14:04, Zoltán Sziládi wrote:" "programmatic queries to DBpedia" "uHi, I'm interested in direct programmatic queries to DBpedia. What are the options? (This is my first posting here. Congratulations to all involved! This is a nice project) Regards, Gustavo Frederico University of Ottawa (Alumni) gcsfred at rogers dot com uGustavo, You can program (using the SPARQL Protocol) against: This is basically how application layer that speaks to the sparql endpoint. uHi Gustavo, here is some sample-code on how to query the dbpedia-Sparql-Endpoint and load the result into a XML-Document (in C#): String SparqlQuery = \"PREFIX rdfs: \" + \"select * \" + \"from \" + \"where { \" + \"?x rdfs:label ?y . 
\" + \"filter bif:contains (?y, \\"\" + SearchString + \"\\") \" + \"}\"; String querystring = \" 2Fdbpedia.org&query;=\"; querystring += System.Web.HttpUtility.UrlEncode(SparqlQuery); querystring += \"&format;=application%2Fsparql-results%2Bxml\"; HttpWebRequest request = (HttpWebRequest) WebRequest.Create(querystring); HttpWebResponse response = (HttpWebResponse) request.GetResponse(); System.IO.Stream resStream = response.GetResponseStream(); System.Xml.XmlDocument doc = new System.Xml.XmlDocument(); doc.Load(resStream); Hope that helps. Cheers, Georgi" "DBpedia SPARQL endpoint gone down" "uHi All, The Virtuoso DBpedia SPARQL endpoint has gone down. We are working on getting it back online ASAP and shall make an announcement on the mailing list when it is. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: support uHugh Williams wrote: Important correction: The machine hosting DBpedia is offline due to maintenance, the DB is fine. Kingsley uOn May 7, 2009, at 09:22 AM, Kingsley Idehen wrote: Andthe machine is back online, with double the RAM, so things should generally be more responsive. Inference rules are currently reloading; should be back to normal around Noon, Eastern US time. Be seeing you, Ted" "only articels with english equivalent are processed" "uHi, I'm working on the Hungarian mapping and a realized that only those articles are processed which have English equivalents. Is there a reason for that or it's a bug? Could this behaviour be changed? Bottyán Nemeth uHi :-) Well actually I am not one person of the dbpedia team, but what would make sense, also to improve the souce where dbpedia data comes from, is the creation of a stub on the English wikipedia (translation of the first paragraph normally is a good way to get this) and linking the two. This will improve both projects and since we take, why not also give? I know you would probably like to see this solved just by software, but I see it as a valid workaround which would provide additional value for all. A nice week-end to all, Bina On Sat, Sep 11, 2010 at 11:33 AM, Bottyán Németh < > wrote:" "process dbpedia data" "uHi, I am wondering that if these existing development toolkits (like Jena) can handle data of such size. What will happen if I import all dbpedia core datasets into Jena? any suggestions on utilizing dbpedia data effectively? uHello, jiusheng chen wrote: Yes, it is possible to load it in Jena (based on an SQL database), Virtuoso, and Sesame: Kind regards, Jens uJens Lehmann wrote: All, Note, Virtuoso now has Native Storage Providers for Sesame, Jena, and Redland (about to be released). See: If you want to use the providers against DBpedia we will have to come through the Virtuoso SQL port (which isn't currently open to the public). That said, we are planning to open up the SQL port as direct API level access to DBpedia will be even faster this way. uIs DBpedia an opensource effort? I am especially interested in RDF model building and SPARQL query endpoint. On Fri, Aug 22, 2008 at 7:51 PM, Kingsley Idehen < >wrote: ujiusheng chen wrote: DBpedia is an \"Open Data\" effort. Kingsley" "Why super classes of external ontologies are not included in resources types ?" "uHi, I wounder why not super concepts from ontologies such as UMBEL and Wikidata are not included in resources types, while DBpedia ontology concepts and Yago classes are included with their super classes ? Thanks. 
Hi, I wounder why not super concepts from ontologies such as UMBEL and Wikidata are not included in resources types, while DBpedia ontology concepts and Yago classes are included with their super classes ? Thanks. uOn Mon, Oct 5, 2015 at 10:42 PM, Nasr Eddine < > wrote: We would have to stop importing additional ontologies at some point. The Yago choice dates back to the beginning of the project were I was not involved but the Yago IRIs are easily aligned with the DBpedia IRIs so this might have been the reason. if there is strong support for importing additional type statements we can consider adding them. One approach would be through community contributions Best, Dimitris" "president type" "uHi, I wonder why \"Richard Nixon\" was a dbo:President in the last DBpedia release but no longer be a dbo:President in this release. Thanks, Lushan Hi, I wonder why 'Richard Nixon' was a dbo:President in the last DBpedia release but no longer be a dbo:President in this release. Thanks, Lushan uHi Lushan, Thanks for reporting the issue. You can try to find the answer by comparing the versions of the Wikipedia page as of 3.6 release and as of 3.7 release. The exact date is on the mapping wiki. If the infobox changed, this could explain the difference, and adjusting the mapping on the mapping wiki would likely fix it. I have also observed that multiple values in a infobox property within a page can be tricky to extract (e.g. if a link or image was added beside the value). So you can look for that also. Cheers Pablo On Nov 3, 2011 10:16 PM, \"Lushan Han\" < > wrote:" "Hi. How with ld_dir_all ( , , ) load dbpedia in local machine, also how in original dbpedia" "uHi. How with ld_dir_all ( , , ) load dbpedia in local machine, also how in original dbpedia. uThis tutorial is yet complete / reviewed but can get you started On Wed, Jul 9, 2014 at 2:28 PM, ок Андрей < > wrote: uThis one is not official but we've used it a couple of times and it works quite well. You can use the lod stack packages to get around compiling virtuoso though. Cheers, Alexandru On Jul 9, 2014 1:34 PM, \"Dimitris Kontokostas\" < > wrote:" "key integrity, data "halos", etc." "uIn the last few days I've been trying to create a set of shapes of administrative divisions in the world; at the very least I'd like to go one subdivision below country; the immediate purpose is to use the shapes to segment out coordinates of dbpedia topics into countries and administrative areas to give myself some ability to geotarget. Anyway, doing that, I ran into one of those amusing anomalies in how wikipedia/dbpedia is built. I found some shapes that appear to be named after three-letter IOC codes for countries, so Germany is \"GER\" instead of the iso digraph \"DE\". No problem, but this kind of thing that means there's no rest for the wicked. Anyway, I've got country names, so I'm probably just going to string match 95% of the names and do the weirdos by hand, but it got me to thinking that \"IOC country code\" is a property of a country. These are represented in that list, but there's the obnoxious thing that these don't link to the countries, but instead link to pages like The IOC code is infoboxed, which is good, but there's no reliable link back to the country. 
Most of these pages have a wikilink that points to the actual country, but often they have links to other countries too, for instance, Ghana points back to Now, it turns out that Ghana's infoboxes point to some pages that have some really rich information Unfortunately, at the moment, dbpedia only knows these as wikilinks. Anyway, it's just a good example of what you've got to deal with when you're extracting data from dbpedia." "Building a Chinese dbpedia SPARQL endpoint" "uDear All, As you know, there is no Chinese SPARQL endpoint (as far as I know). I have installed a local Virtuoso server and want to create a local Chinese SPARQL endpoint. Looking at the (for example Japanese) and Assuming I don't need geo, person, redirects, data, it looks like the basic elements I need are:
1- instance_types_ja.nt.gz : to get the DBpedia types for entries
2- mappingbased_properties_ja.nt.gz or the raw_infobox_properties_ja.nt.gz for the relationships between NEs
3- labels_ja.nt.gz for labels
4- article_categories_ja.nt.gz for article categories
The above files are absolutely necessary to make an endpoint with minimum functionality. So I imported them for local Japanese and they look fine so far (did not test much though). Now, if I do the same for Chinese, first of all, there is NO instance file there. Without this file, there is not much that can be done, right? What should I do? Thanks
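As a quick sanity check after importing one of these language editions, a type count over the target graph shows whether the instance_types file actually made it in. The graph IRI below is only a guess at the naming convention used locally, so adjust it to whatever graph the files were loaded into:
SELECT ?type (COUNT(?s) AS ?cnt)
FROM <http://ja.dbpedia.org>
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?cnt)
LIMIT 20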
filter, join, aggregate, traverse, you name it What do you think? Cheers, Julius Chrobak co-founder of mingle.io - Query API of Open Data @julochrobak Hello dbpedia enthusiasts, I'm a big fan of the Linked Open Data cloud and looking at the activity around dbpedia, I'm convinced there is a lot of value in this data! Recently I was wondering if there is anyone interested in having a single place where all the major datasets from Linked Open Data cloud would be available for querying. This way, queries could be run across them, i.e. filter, join, aggregate, traverse, you name itWhat do you think? Cheers, Julius Chrobak co-founder of mingle.io - Query API of Open Data @julochrobak u0€ *†H†÷  €0€1 0 + uHi, Kingsley, How do you go about getting your hands on the LOD Cloud? Do you just crawl? Do you use the metadata in datahub.io? Do you ride Sindice? If one had some LOD sitting around, how do they fall into your conglomerate? Thanks, Tim On Jul 12, 2013, at 8:13 AM, Kingsley Idehen < > wrote: ucool. I have not seen it before. Let me have look… one more quick question. Is the SPARQL end point reliable? Can it be used for real time queries from apps? Or would you recommend it solely for ad hoc queries? Cheers, Julius On Jul 12, 2013, at 3:23 PM, Timothy Lebo wrote: u0€ *†H†÷  €0€1 0 +" "SKOS, Eponymous Categories, and Main Articles" "uHi, I'm a graduate student working on a research project using Wikipedia's categorization system. I stumbled on to DBpedia while trying to discover the best way to access this data and have been plunging myself in to the wonderful world of the Semantic Web, RDF, SPARQL, DBpedia, and the inner workings of Wikipedia ever since. First, thanks for such a wonderfully structured access point to this information! Second, in the application I'm developing I'd like to combine 2 (or more, potentially) resources into one idea. For example, the article \"Pasta\" and category \"Category:Pasta\" both describe (roughly) the same idea. Right now I'm just matching labels. This becomes a problem, though, when the resources don't have the same name. Pluralization is one of the largest differences (e.g. \"Sandwich\" vs. \"Category:Sandwiches\"), but there are many cases that can't be fixed by a simple pluralization change. Wikipedia, though, has a system for manually denoting these \"eponymous categories\" and assigning main articles to categories. For example, see Wikipedia's category page for Pasta [1]. At the top we are given a link to the \"main article for this category\". At the beginning of the list of pages in the category we see \"List of Pasta\" and under the * we see the articles \"Pasta\" and \"Noodles\". The systems for denoting these connections are described in detail at [2], [3], and [4]. This information, however, is not present in DBpedia. I feel like this would be a valuable addition to the category information already available, but I don't want to pretend to know how to go about extracting this information or how to denote it (some SKOS predicate maybe?). Does anyone else think this would be useful information? Anyone super familiar with the current extraction framework and would know how doable this is or why it hasn't been done before? I know I won't be able to work it into my project (deadlines) but this project has exposed me to so many new ideas and technologies that I now feel invested in them all. 
Cheers, Matt [1] [2] [3] [4] Wikipedia:Categorization#Typical_sort_keys uHello, Matt Mullins schrieb: [] It hasn't been done, because we were not really aware of it. Linking category and main article could be useful and is doable within the extraction framework (by adding a new extractor which searches for certain patterns on category pages). However, if I understand you correctly, you will not have time to do this yourself.(?) If this is the case, then you can add it as a feature request to our tracker: However, at the moment we have some urgent items on our ToDo list, so we cannot make any promise on whether and when we implement it. Kind regards, Jens uOn Thu, Jul 16, 2009 at 11:35 PM, Jens Lehmann < > wrote: Yep, I'm basically saying I can't do it soon so if anyone is interesting in investigating it, please do. That being said, if no one gets to it I could definitely see myself trying to dive into your development environment and contributing, I was just assuming it would be a lot easier for people already familiar with what's needed. Me being such a newbie at the semantic web, it would take a good deal of investigation and a lot of questions to you all, but it definitely sounds intriguing. Thanks for your input, Matt" "Status of live.dbpedia" "uHelloooooo everybody, I am a newcomer in this \"Linked Data\" community, and I do not know much about anything really. I started using dbpedia a couple of weeks ago, to link some of its information with a software I contribute to [1], and this led me to update wikipedia on many points. Now, I am waiting for live.dbpedia to register those changes so that I can resumoe working on this database of mine, but it seems that live.dbpedia is not live anymore. Does anybody know when it will match wikipedia again? Thank youuuuuuuuuuuuuu, Nathann [1] uHi Nathan, This is a known issue, see: and Cheers! On 7/2/15 1:03 PM, Nathann Cohen wrote:" "Multi-Type Named Entities Can not be found in English DBpedia" "uDear All, I did the following query for english DBpedia select ?s where { ?s a dbpedia-owl:Actor;a dbpedia-owl:MusicalArtist } limit 10 and to my surprise, returned nothing. In general I have never been able to find a Named Entity with multiple types that are not a s subset of each other in DBpedia ontology (Like dbpedia-owl:Actor, dbpedia-owl:Athlete,) Why is that? Am I doing something wrong? For example Jennifer_Lopez and Elvis_Presley are both Actor and MusicalArtist, but For Elvis_Presley, the DBpedia only types are just Thank you Dear All, I did the following query for english DBpedia select ?s where { ?s a dbpedia-owl:Actor;a dbpedia-owl:MusicalArtist } limit 10 and to my surprise, returned nothing. In general I have never been able to find a Named Entity with multiple types that are not a s subset of each other in DBpedia ontology (Like dbpedia-owl:Actor, dbpedia-owl:Athlete,) Why is that? Am I doing something wrong? For example Jennifer_Lopez and Elvis_Presley are both Actor and MusicalArtist, but For Elvis_Presley, the DBpedia only types are just you uWikipedia has this: {{Infobox person | occupation = Singer, actor | module = {{Infobox military person | module2 = {{Infobox musical artist | instrument = Vocals, guitar, piano | background = solo_singer | genre = {{flat list| *[[Rock and roll]] *[[Pop music|pop]] You can find the newest extraction here: Unfortunately DBpedia processes only the first two infboxes (Person and Military person) but not Musical artist. 
It even skips the instrument, background and genre fields from the third infobox (Musical artist). Gerard Kuys has remarked that DBpedia picks only one leaf class \"to avoid contradictions\". I can understand that various infoboxes scattered throughout the article could contribute non-sensical classes, especially if they have non-sense mappings as described here: However: - The above two \"modules\" are not randomly scattered, they are embedded in the main infobox template - How is \"contradiction\" defined\"? Definitely the subclasses of Person are *not* disjoint, there are numerous examples. I posted issue Also: it is not currently possible to examine one field (like \"occupation\" above) and emit two classes: see here dbpedia-problems-long.html#sec-3-1 usometimes it helps to look at the state of the articles at the time of extraction DBpedia assigns a single type for each resource and creates separate ones for subsequent mapped templates if they are not direct subclasses/superclasses of the first mapped template in this case we have an infobox Person followed by a infobox military person One problem is that we do not process embedded templates (Infobox musical artist)which is mainly a design issue. I am not aware who made it in the past, it is quite easy to change it but not sure of the implications of such a change A view of the current extraction can be better seen in On Wed, Feb 18, 2015 at 3:24 PM, Vladimir Alexiev < > wrote:" "Wiktionary extraction help" "uDear all, I tried recently to extract data from a french Wiktionary dump with the extractor of the community on github. But this create strange data like this, for every word. Few data about every word, word from other languages are parsed too. I don't know if it's normal. . . \"encyclopédie\"^^ . . . . . \"7-139\"^^ . . . \"accueil\"^^ . . . . . \"7-112\"^^ . If you have any idea of what is happening. With regards Raphaël Boyer WIMMICS TEAM INRIA France Dear all, I tried recently to extract data from a french Wiktionary dump with the extractor of the community on github. But this create strange data like this, for every word. Few data about every word, word from other languages are parsed too. I don't know if it's normal. < France uHi Raphael The Wiktionary mappings are unmaintained for quite some time now. You are more than welcome to update them and match the current wiktionary structure or look at other options such as dbnary. Best, Dimitris On Mon, Nov 16, 2015 at 5:59 PM, Raphael Boyer < > wrote: uHi, also consider that each wiktionary aims to contains every single word for each human language. So you can't do the assumption of reading only italian Wiktionary to have a list of all the italian words. The best approach is to read all Wiktionaries looking each one for italian words. In my case I used both english and italian wiktionaries to retrieve a huge list of italian words (other editions were irrelevant). Cheers, Riccardo 2015-11-17 10:09 GMT+01:00 Dimitris Kontokostas < >:" "BE lang in OntologyClass and mapping statistics" "uHi everyone, Can you tell me how to fix these problems? 1. OntologyClass page doesn't display label for the Belarusian language. Examples: 2. DBpedia mappings statistics shows incorrect information. For example, Wiki template \"term_start\" and is used in plenty of Wiki-articles. But in DBpedia mapping statistics the property \"term_start\" doesn't have instances (\"property is mapped but not found in the template definition\"). Thanks in advance. 
Best regards, Włodzimierz Hi everyone, Can you tell me how to fix these problems? 1. OntologyClass page doesn't display label for the Belarusian language. Examples: Włodzimierz uHi Włodzimierz On Fri, Mar 6, 2015 at 4:19 PM, Wlodzimierz Lewoniewski < > wrote: You can create an account in the mappings wiki and ask for editor rights to add the missing labels you can find a complete list here Templates change over time and these terms existed some time ago. We should clean these up indeed" "Error in results for David Allen Green" "uCan anyone please tell me why shows two occupations for David Allen Green; the second of which is the bogus \"David_Allen_Green1\"? I've checked and can't see where it's coming from." "questions about dbpedia 3.7" "uDear list, I set up a local dbpedia mirror with virtuoso these days, the version is dbpedia 3.7. I happen to notice that there is a rdf:type of (sorry that I don't remember the full namespace) for on this a problem happening on my dataset, or this is the update? What is the dbpedia version for BTW, what is the total count of triples in dbpedia 3.7? Mine is 227052605, because I encountered some problems concerning very long IRIs, I moved lines, but not many. Are my triples right? best regards, June uHi June, On 01/16/2012 12:39 PM, June wrote: Are you sure that you have also loaded the dataset titled \"Links to YAGO2\" to your DBpedia mirror? I think that you haven't loaded it." "why dbpedia ontology is not complete" "uHi, Compared with YAGO ontology (using the namespace prefix yago and dbpprop), I found that the recently released DBPedia ontology (using namespace dbpedia-owl) is somehow missing some triples it were supposed to have. I understood that many terms in YAGO ontology are not included in DBPedia ontology. However, regarding to the terms both in YAGO ontology and DBPedia ontology, those in DBPedia ontology have less instance data. For example, the resource dbpprop:spouse but misses dbpedia-owl:spouse. Another question: why Bill Clinton has not the rdf:type dbpedia-owl:President. Thanks, Lushan Han uOn 4/6/11 5:57 PM, Lushan Han wrote: We are working on YAGO2 dataset, it will soon be loaded, once this is done you can revisit this matter in a more productive way. For now, just wait for the upload of YAGO2 dataset. Kingsley" "Quepy, transform questions in natural language into queries in a DB language." "u*We are sharing an open source framework Quepy queries in a database language. It can be easily adapted to different types of questions in natural language, so that with little code you can build your own interface to a database in natural language. Currently, Quepy supports only the SPARQL query language, but in future versions and with the collaboration of the community we are planning to extend it to other database query languages. You are invited to participate and collaborate with the project. As a proof of concept to illustrate the usage and potential of the framework, we developed an interface between some kinds of questions and some of the content of the dbpedia. You can see the result of this small example at We leave here links to the documentation [0], the source code [1], and also a Pypi package [2]. Also, as an example, we have an online instance of Quepy the interacts with DBpedia available [3]. Source code for this example instance is available within the Quepy package so you can kickstart your project from an existing, working example. 
[0] [1] [2] [3] quepy.machinalis.com (Don't expect a QA system, it's an example) [4] quepyproject[(at)]machinalis.com* We're doing an online hangout to show example code and answer questions about the project: Hangout event on Wed, January 30 at 19:00 UTC Also we invite you to try the new user interface: Regards" "Idiots guide to dbpedia" "uHi I want to build a domain ontology for wildfires. I understand dbpedia extracts linkages from Wikipedia. I tried a basic sparql query (I do not know sparql) to extract all references to fire and got something about a community fire unit. Wikipedia provides much more than that so something is wrong. Can somebody give me a 20s tutorial how to extract RDF information on wild fires from dbpedia or am I being completely naïve here? I am a Geologist by training so be kind :) Cheers" "DBpedia Relationship Finder Release" "uHello, I am glad to announce the DBpedia Relationship Finder, which is developed by Jörg Schüppel and me: It explores the DBpedia infobox data set to find out which relationships exist between two things. It can answer questions like \"How are Leipzig and Semantic Web related?\". The relationship finder provides another useful and especially easy to use interface for DBpedia. Best regards, Jens uHi Jens, Very nice. It passes the Six Degrees of Kevin Bacon test! (See saved query.) Please pass on kudos to Jörg as well. A few initial observations, however: 1. Some of my queries (e.g., automobile AND chevrolet) seemed to produce duplicate results paths 2. I think shorter depth paths than the maximum specified should *also* be shown in the results, and likely be presented first 3. This example shows why ongoing means for structure clean up will be needed. Virtually every one of my queries connected through the United_States as a location. I suspect quite a few shared relations may prove to be either too broad or too narrow for real utility. But, what can/should be the mechanism to identify and exclude them? 4. Similarly, it would be nice if the interface handled/replaced the underscoring of terms. But these are quibbles, and do not detract in any way from how useful such a query format is. :) I also really like the field auto-completion, which is really important with no a priori knowledge of the objects in Wikipedia. Thanks! Mike Jens Lehmann wrote: uJens, Well donethis is another great way to meaningfully view the dataset. I especially like being able to click on the connections to get the full infobox that yielded the connection. It's an important piece of letting the user assess the origins of the connections, which I think librarians and folks in higher education will value greatly. Picking up on Mike's third point uHello Michael, Michael K. Bergman schrieb: Jörg ist reading the list as well. I am glad that the relationship finder passed your tests. :-) We will have a look as to why this happens. There is still a small bug in the search, which we will fix. However, the shortest paths should always be shown. Sometimes one thinks that there should be a shorter path through the data, but in fact there isn't. This is often the case for objects, which do not have Infoboxes in Wikipedia (or do not have one particular attribute, which might be interesting). Do you have a concrete example where the shortest path is not shown? We thought of not completely excluding such things, but let the user say which objects/properties should be ignored for a concrete query. 
However, this requires some technical and user-interface changes, which need some time to be implemented. So we decided to release without this functionality and add it afterwards. We will use/generate nice labels everywhere to get rid of underscores, %C2 etc. It is also nice if you just want to play with the data by more or less randomly entering some letters. :-) Jens uHello, Patrick Gosetti-Murray-John schrieb: Thanks. I also saw your Blog entry [*]. :-) uTo Jens and Jörg et al, I think what you have produced here is a new user interface exemplar, one of likely more to emerge as new ways are discovered to explore and exploit RDF graph data. I would like to suggest the two of you, as authors, post this to the linking-open-data group as well. (I would do so, but you're the innovators and I don't want to be presumptuous! :) ) It seems to me that a new ESW wiki section covering user interface and query alternatives could be a great resource to developers and data publishers moving forward. This linkage diagram is certainly one of those alternatives, joining network graph, conventional graphs, tabular, triples, timelines, maps, resource records (a la ZoomInfo), etc. Again, nice work, and I think discussions such as this, plus new tweaks and tests of the paradigm, will naturally evolve to the best combination of functionality and interface. Thanks, Mike Jens Lehmann wrote: uHello Mike, Michael K. Bergman schrieb: Yes, in principle one could use the relationship finder for arbitrary RDF stores. (This is one of the nice things in the Semantic Web.) No problem, just go ahead. :-) I agree. (Admittedly, I don't use this Wiki very often yet. Imho structure/content could be improved.) Jens" "All birds are pink" "uHi all I was giving tonight a first lesson of SPARQL to my son, who is both a Wikipedia addict and bird watcher. Starting from winter visitor in our garden) to figure a few properties to use in a simple query, we remarked the 'p:color pink', a bit approximative for the Hawfinch. Checking the Wikipedia page, we found no obvious explicit mention of this pink color, neither in the infobox content, nor in the article text. Curious, we tried to found out other values of p:color for birds through the following query. PREFIX p: PREFIX dbpedia: SELECT DISTINCT ?c WHERE { ?x p:classis dbpedia:bird. ?x p:color ?c. } Well, we got three values 'pink', 'Pink', or 'none'. What did that mean? All birds in Wikipedia are pink or without color? Or some fanatic of pink has spammed the birds articles? So we checked again the Wikipedia Hawfinch page, and the taxobox source code in edit mode {{Taxobox | color = pink | name = Hawfinch | status = LC | image = CoccCocc.jpg | image_width = 240px | regnum = [[Animal]]ia | phylum = [[Chordate|Chordata]] | classis = [[bird|Aves]] | ordo = [[Passeriformes]] | familia = [[Fringillidae]] | genus = ''[[Coccothraustes]]'' | species = '''''C. coccothraustes''''' | binomial = ''Coccothraustes coccothraustes'' | binomial_authority = ([[Carolus Linnaeus|Linnaeus]], 1758) }} Indeed color = pink is there. Now go back to the article and look at the taxobox display. Well, pink is there, after all. Which leads to interesting questions. How can the automatic extraction sort the stylesheet properties of the taxobox from the properties of the subject it describes? No way as is now. If Wikipedians knew best, they would split them neatly in the taxobox code. I wonder if infobox templates creators would care to do something about it? 
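For readers who want to reproduce the experiment, here is the same query with the prefix declarations spelled out; the two namespace IRIs are assumptions about what p: and dbpedia: stood for at the time (the raw infobox-property namespace and the resource namespace):

PREFIX p: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT DISTINCT ?c WHERE {
  ?x p:classis dbpedia:bird .
  ?x p:color ?c .
}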
Simply replace \"color\" by \"taxobox_color\" would help. Note the subtle semantic ambiguity for image and image_width. The property image can be viewed as a property of the Hawfinch (the value is an image of some Hawfinch), although it's there to call an image in the infobox, along with image_width, so it is also a property of the infobox, somehow. Bernard upink signifies that in wikipedia it is an animal. its an artifact of older wikipedia templates, and more generally that wikipedia is very much about display markup rather than semantic markup. the colors are discussed here: -jg Bernard Vatant wrote: uHello John determined by regnum = Animalia. But you can still override it by an explicit color declaration. Someone not aware of the semiotic rules under the template (pink is for animals, green for vegetal, etc.) can change the color to \"green\" or \"blue\". Nothing prevents you to do so, I just tried. :-) Same remark for the image_width, BTW. Which means the proper use of the template would provide a more semantic markup, and certainly a bot could clean up all useless display markup in the legacy Discussing that again with my son at breakfast, he told me that this was indeed a flaw in the English wikipedia taxobox practice (a soft translation of what he actually said :-[ ). In the French Wikipedia, the color is also built-in in the \"Taxobox animal\" stylesheet, but there is no way to override it (no \"couleur\" slot is available). No wonder la France est le pays de l'Encyclopédie ;-) . Bernard u*VERY* interesting example Bernard, thanks for pointing it out. For me dbpedia provides a wealth of evidence that at least some high-level control of the vocabulary used would be valuable; this is the best example so far in that I think anyone can understand the separation of concerns between information about the display properties of the page vs. information about the thing being described. -Chris Bernard Vatant wrote: uChris Welty wrote: The problem is, that Wikipedia authors currently create and design templates often solely with the goal of having nice infoboxes on the Wikipedia pages. For the infobox extraction we decided to have one global property namespace, so that the \"population\" attribute in the us_cities templates becomes the same DBpedia property as the \"population\" attribute in the german_cities template. This results in the ambiguities Bernard mentioned, but has the advantage, that you can easier query DBpedia in many cases. It is very easy to resolve the ambiguities, by: * defining attribute names globally unique in Wikipedia * introducing a new Wikipedia pagespace for template attributes and let template authors shortly describe their attributes therein. (They could even identify different attributes by simply creating Interwiki links or a small template with a sameAs attribute.) * write a small bot who resolves ambiguities by renaming attributes in certain templates for cases as the one Bernard mentioned If we have the resources for that, we will definitely do it if nobody else did before. Best, Sören uChris Actually I found the example extremely interesting because I got misled myself, although I meet this kind of ambiguity almost daily in explicitation of data semantics with data providers, and I know quite well the why's and how's of Wikipedia infoboxes. Sort of caricature of the daily problems in knowledge extraction from structured data. Sören \"Solely\" is a bit harsh. First aim of infoboxes is to sum up elements of description of the article's subject. 
Otherwise any knowledge extraction from the infoboxes, such as the one dbpedia performs would be completely meaningless. As the French Wikipedia example shows, it is that the display properties can be hidden under the hood of the style sheet, so that the infobox content is actually a structured description of the thing the article is about, not a self-description of the infobox display. People building infoboxes can understand that. I guess many of them are aware of the data structure / stylesheet distinction, and apply that to the templates they build. Doing so, they make their information much more usable for knowledge extraction. I'm afraid you somehow miss my point here. The problem is not the ambiguity of \"color\" semantics. It's used consistently to represent a color. Pink is pink, be it a bird or an infobox. The ambiguity is at the level of the *subject* to which this property is applied, and origins in the fact that properties applying to different subjects are presented at the same data level, without any possibility to sort them out. It's a completely different issue to decide if \"color\" for birds has the same semantics as \"color\" for quarks. See > It is very easy to resolve the ambiguities, by: In theory, yes. Practically impossible in Wikipedia. Who will define that? Seems to me like defining attribute names globally unique in the Semantic Web. Neither possible nor desirable. This suppose that Wikipedians are more interested in the indirect use of Wikipedia content (data extraction) than its direct use (human consumption). For things much more simple, such as a consistent use of geographical coordinates template, it's already very difficult to achieve a consensus, because many Wikipedians don't want to change their practice to take in consideration the re-use of Wikipedia data. And they have good arguments to do so, based on the independence of the encyclopedia from any external interest (the famous NPOV argument). This can be done on a pragmatic case-by-case basis. Tle more so that in the very example of the animal taxoboxes, the \"color\" property is totally useless > If we have the resources for that, we will definitely do it if nobody What do you mean by that? See above, and be cautious not to provoke a reject reaction from Wikipedia's side. There are a lot of services who now want to use Wikipedia data with more or less (u)nclear business motivations, and will for this purpose, if needed, modify the source content of Wikipedia in such and such a way that fit their needs. This is likely to make Wikipedians more and more nervous about external pressures on the content. Bernard uBernard Vatant wrote: I think if the template attribute you refer to would be named \"infobox-color\" everything is perfectly clear. Of course you still can delve into philosophical discussions whether \"infobox-color\" should be applied to birds, but honestly: I do not care, since it has absolutely no impact. I think it is practically very easily resovable, it will take some discussions with the people responsible for introducing a new namespace, creating awareness in the Wikipedia community and maybe some minor Mediawiki modifications (but I think even they are not really needed). Very possible and desirable! (With globally I did not mean for the whole world, but for the whole Wikipedia.) Many things in Wikipedia are unique: author names, article names, image names etc. 
It just requires a new namespace for template attributes (and the creation of a page for each template attribute in this namespace). If I would not believe that this is possible I would have stopped working on DBpedia, since without this DBpedia does not make sense. Once a namespace for template attributes is created and there is a page for the infobox-color attribute, we can just include the following template there: {{infobox-attribute| rdf:type=owl:AnnotationProperty }} This will enable the DBpedia extractor to create suitable info about the used properties. Also, it enables to identify different properties by means of a 'owl:sameAs' attribute in the 'infobox-attribute'. Hence, even without changing anything in Wikipedia, we can identify the \"pop\" attribute in the German_cities infobox-template with the population attribute in the US_cities infobox-template. I think the purpose and impact of such a template-attribute namespace is easy to understand and I do not expect lengthy fights about that in the Wikipedia community and as you mentioned earlier Wikipedia template designers are already aware that well designed templates are better. Best, Sören uHi Sören I see you are much more optimistic than I am on the capacity of Wikipedia to accept widespread controlled \"semantic augmentation\". On which basis? Are you active yourself, or someone else from dbpedia, in the Wikipedia community in order to help such things to happen? Have you engaged discussions about it with Wikipedia folks, and with which feedback so far? Supposing the template attribute namespace is defined, who will annotate the attribute pages with the OWL annotations? Do you expect the template-builders to do it? I don't want to seem over-skeptical about it, but just wondering if you have already thought about the real-life implementation, or if it's just so far simply a scenario Bernard Sören Auer a écrit : uBernard Vatant wrote: We have very strong contacts to the German Wikimedia e.V. chapter and some of our students in Leipzig are active Wikipedians and even Mediawiki developers. I think once you show the Wikipedians the advantages of a lightweight semantification of the template system they will be very interested and eager to make it really happen. Our problem right now is just that we do not have the funds to intensively work on that - we are in the process of acquiring funds to do so, but (as you know) this is very time consuming and lengthy Best, Sören" "Mapping-based Property errors" "uI'd like to point everyone's attention to: \"Errors detected in the mapping based properties. At the moment the errors are limited to ranges that are disjoint with the property definition\". - In many cases this reflects inflated expectations by ontology authors as to what Wikipedia editors put in fields. - But the ERRORS file itself seems to be old (doesn’t correspond to current dbpedia data). E.g. see Let's prefixize it by using these prefixes riot uHi Vladimir, On Mon, Sep 7, 2015 at 5:56 PM, Vladimir Alexiev < > wrote: Yes, this remark is correct, wrong modeling can lead to wrong errors :) I don't get the old part here. All static versions are based on older versions of wikipedia. But, all these datasets originate from the same wikipedia dumps and the errors are generated by processing the mapping-based-properties file. The mapping-based-properties file is then split to two datasets, correct / errors and only correct is loaded in the endpoint, the error dataset is provided for completeness." 
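A quick way to check the point above that only the correct split is loaded into the endpoint: copy the subject, property and object of any line from the errors dump into an ASK query. The triple below is a placeholder, not a real line from the file:

PREFIX dbo: <http://dbpedia.org/ontology/>
ASK {
  # placeholder for one triple taken from the errors dump
  <http://dbpedia.org/resource/Some_Article> dbo:birthDate "not a date" .
}

If the ASK returns false for triples taken from the errors file but true for the corresponding lines of the correct split, the endpoint matches the description above.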
"How to extract onProperty, someValuesFrom, intersectionOf for each class in OWL file" "uDear All, I need to extract onProperty, someValuesFrom, intersectionOf for each class in owl file. I created SPARQL query , but it does not extract all onProperty, someValuesFrom, intersectionOf for each class that included in my file here is the SPARQL query : PREFIX abc: PREFIX ghi: PREFIX mno: PREFIX owl: PREFIX rdfs: PREFIX rdf: SELECT (COUNT(*) AS ?no) ?class ?subClassOf ?ObjectProperty ?someValuesFrom ?intersectionOf WHERE{ ?class a owl:Class . FILTER( STRSTARTS(STR(?class),\" This is my OWL file: xml version='1.0' encoding='%SOUP-ENCODING%' haulage worker I got this output :" "Fw: DBpedia look up not responding" "uSir/ Ma'am, I am using the DBpedia lookup online service for my work. However, the APIs have outdated and I am facing a roadblock. I see that you have suggested using the docker images of look up hosted locally. Could you kindly elaborate in more detail, preferably stepwise in layman terms as a tutorial for beginner. I have also tried following the instructions on the following links and had difficulty locating the default index path file. [ DBpedia Lookup Service - Find DBpedia URIs for keywords. This documentation page might be outdated. Please also check [ README.md DBpedia Lookup. DBpedia Lookup is a web service that can be used to look up DBpedia URIs by related keywords. Related means that either the label of a Janaki Joshi, Web Science Lab, International Institute of Information Technology - Bangalore" "select distinct properties" "uHi! I find extremely useful to execute some SPARQL queries to get familiar with the dataset. One of these queries is default for Virtuoso (get all classes), I also like this one: select distinct ?y WHERE { ?x ?y ?z } LIMIT 100 It was very surprising when I found that such a simple query will lead to \"Virtuoso S1T00 Error SR171: Transaction timed out\" on Sincerely yours, Yury uHi Yury, You are basically doing a table scan, on all of the triples in the db to make a hash of all the unique values of P, then return the first 100. The full execution time for this type of query is larger than we currently allow any query to run for on dbpedia.org, depending on how many other substantial queries are running at the same time. What you can do on any Virtuoso /sparql endpoint, is to use an ANYTIME query, by filling in the Execution Timeout field on the form: If you click on the above link and press the \"Run Query\" button, Virtuoso will collect as many unique values of P it can find in 50 seconds and then return up to a maximum of 100 results, depending on how many unique values it collected in that time period. Patrick uHi Patrick! Now I see the problem, thanks! Just in case I'll be in a need of speeding up this kind of queries for my storage: can I reduce this problem with indexing? Sincerely yours," "DL-Learner 1.0 (Supervised Structured Machine Learning Framework) Released" "uDear all, the AKSW group [1] is happy to announce DL-Learner 1.0. DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis. 
Website: GitHub page: Download: ChangeLog: DL-Learner is used for data analysis tasks within other tools such as ORE [2] and RDFUnit [3]. Technically, it uses refinement operator based, pattern based and evolutionary techniques for learning on structured data. For a practical example, see [4]. DL-Learner also offers a plugin for Protégé [5], which can give suggestions for axioms to add. DL-Learner is part of the Linked Data Stack [6] - a repository for Linked Data management tools. We want to thank everyone who helped to create this release, in particular (alphabetically) An Tran, Chris Shellenbarger, Christoph Haase, Daniel Fleischhacker, Didier Cherix, Johanna Völker, Konrad Höffner, Robert Höhndorf, Sebastian Hellmann and Simon Bin. We also acknowledge support by the recently started SAKE project, in which DL-Learner will be applied to event analysis in manufacturing use cases, as well as the GeoKnow [7] and Big Data Europe [8] projects where it is part of the respective platforms. View this announcement on Twitter and the AKSW blog: Kind regards, Lorenz Bühmann, Jens Lehmann and Patrick Westphal [1] [2] [3] http://aksw.org/Projects/RDFUnit.html [4] http://dl-learner.org/community/carcinogenesis/ [5] https://github.com/AKSW/DL-Learner-Protege-Plugin [6] http://stack.linkeddata.org [7] http://geoknow.eu [8] http://www.big-data-europe.eu" "Data Points in Australia coast" "uHello I have been working with dbpedia the last months, I also asked something about the point values on the mailing list, and you already gave me good advice on other semantic datasets to check. I continued working with dbpedia because I want other information provided about the Points datasets, relations, wikipedia abstracts etc. But I would like to ask again the question on how the point values are generated. Recently I plotted DBpedia points on a map: To determine the most populated zones. There are some errors, but I assume that it's okay, it is impossible not to have errors. What worries me is the kind of duplication found on Australia: For example, the N-triples for Melbourne lat,lon are the following: uHello Jordi, these are nice images, maybe you can add them and/or your project to wiki.dbpedia.org and link it at: For the generation of geocoordinates a two step process is involved, which includes some Java/Scala code, i.e. the Geoparser and also some mapping rules in the Mappings Wiki for example see the Geocoordinate mapping on I would suggest the following: DBpedia 3.8 is almost ready and maybe the bug with the duplicates is fixed already, so we don't have to do anything. You might also want to try the live endpoint: All the best, Sebastian Am 25.07.2012 15:35, schrieb jordi castells: uoopsdidn't realize this didn't go to the list :) uThanks Sebastian & Pablo for your responses. I checked, as you suggested, the live endpoint but the problem persists. Not on all the points, for example Melbourne is okay, but objects. I also would like to note that this page is not in sync with Wikipedia page of Prahran. I'm thinking of the possibility that the extractors source code is okay now, but not when the articles were extracted. Last update of this Wikipedia page is 18 July 2012. I checked the added triples on reference to the Prahran resource. And thanks, I also like the images. When it's done I will upload them at the wiki. Salut! * uHi all, some Wikipedia pages contain coordinates in multiple places, that's why DBpedia extracts multiple values. 
Some pages using latd values that seem to say that the place is in the Northern hemisphere. Examples: | coordinates = {{Coord|37|48|49|S|144|57|47|E|type:city(4000000)_region:AU-VIC|display=inline,title}} Looks correct - |S| means southern hemisphere, and DBpedia extracts this correctly. | latd =37 |latm =48 |lats =49 | longd =144 |longm =57 |longs =47 To me (and to DBpedia), this looks like Melbourne is in the Northern hemisphere. Similar for {{Coord|-37.852|144.998|format=dms|type:city_region:AU-VIC|display=title}} looks correct |longd=144.998|latd=37.852 looks like Northern hemisphere. This needs further investigation. If the values are wrong, we should find out how many pages are affected. Or maybe Template:Infobox_Australian_place expects the latd value to be positive, because it knows that all values will be in the Southern hemisphere anyway? In this case, we'd have to improve our framework to be able to handle such a special case. JC On Thu, Jul 26, 2012 at 3:57 PM, jordi castells < > wrote:" "missing properties in DBPedia stats" "uRecently generated stats sometimes show imprecise information - properties that are in the infobox template are shown as \"property is mapped but not found in the template definition\" and their occurrences are not counted. Example: Is the template definition the same as is retrieved from If yes, the last change in one of the 3 properties shown as not found in the template was 2.5 years ago - so it is not the case of recent changes that are not taked into account by the stats system. A similar issue, discussed here before: Cheers, Uldis Recently generated stats sometimes show imprecise information - properties that are in the infobox template are shown as 'property is mapped but not found in the template definition' and their occurrences are not counted. Example: Uldis uHi Uldis, The statistics (try to) capture the actual template usage which does not always is the same as the official template definition. the following entries indicate that the statistics did not find any value occurrence in lv Wikipedia for these properties (that are officially defined in the infobox) naalma_maternaapbalvojuminaperiodsdo you have any examples of these properties having a value in an article? (note that the stats were generated from a dump which is 1-2 months old) regarding the template definition, usually the templateExtractor does a good job and identifies all properties but templates are sometimes too complex and the parser fails. On Thu, Oct 15, 2015 at 9:47 AM, Uldis Bojars < > wrote: uHi Dimitris, the following entries indicate that the statistics did not find any value It does not seem the case for the infobox in question - there were multiple occurences of alma_mater (in Rakstnieka infokaste) in LV wikipedia dump from July 2015: '/{{Rakstnieka infokaste/,/}}/' - | egrep \"alma_mater\s*=\s*(\w+|[\['])\" | wc 15 112 1087 For other 2 properties: - 27 matches for \"apbalvojumi\" - 21 matches for \"periods\" Examples of pages using these properties: | alma_mater = [[Pēterburgas universitāte]] | apbalvojumi = [[TrÄ«szvaigžņu ordenis]], [[Tēvzemes balva]], [[AtzinÄ«bas krusts]] | periods = 1887—1943 P.S. It's would not be a problem if this happens just for this infobox. But the same issue might appear in stats for other languages / infoboxes too. Cheers, Uldis On 15 October 2015 at 10:05, Dimitris Kontokostas < > wrote: uI see, thanks for the report can you submit a bug in the github issue tracker? 
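Coming back to the Australian coordinates discussed a few messages up: assuming the geo-coordinates dataset is loaded, a two-triple query shows what the endpoint currently holds for Melbourne. A positive latitude would confirm the northern-hemisphere artefact described above, and several bindings per variable would confirm the duplicate-point issue (geo: is the W3C WGS84 vocabulary, dbr: the DBpedia resource namespace):

PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?lat ?long WHERE {
  dbr:Melbourne geo:lat ?lat ;
                geo:long ?long .
}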
On Thu, Oct 15, 2015 at 12:17 PM, Uldis Bojars < > wrote: uDone: Uldis On 15 October 2015 at 12:39, Dimitris Kontokostas < > wrote:" "How to find the order of updates?" "uHi, I want to import all the dbpedia live updates into sesame store. I downloaded 2013, April months data. I couldn't figure out the order of n-triples that have to added and deleted. Some of them are .nt.gz files and some of them are in yyyy-mm-dd.tar.gz format and some of them are in yyyy-mm-dd-hh.tar.gz format. In what order I have to perform the updates? Thanks Vinod Hi, I want to import all the dbpedia live updates into sesame store. I downloaded 2013, April months data. I couldn't figure out the order of n-triples that have to added and deleted. Some of them are .nt.gz files and some of them are in yyyy-mm-dd.tar.gz format and some of them are in yyyy-mm-dd-hh.tar.gz format. In what order I have to perform the updates? Thanks Vinod" "DBpedia and disambiguations" "uHi all, I have noticed there are a lot of pages in the enwiki which are not detected as disambiguation pages by the DBpedia EF. This is because of the not-maintained map of disambiguation templates in [1] I am working on the suggestion which appears in [1] and parsing MediaWiki:Disambiguationspagepages to collect disambiguation template names. Anyway I have a couple of questions: 1) Should I work on DEF master branch or dump branch? The dump branch seems to be the most updated one (source of DBpedia 3.9?) Are you going to merge it back to master? 2) Since there are many differences between [1] and what we can get with MediaWiki:Disambiguationspage should we merge them or simply remove [1]? Would be great to have your opinion on this. Cheers Andrea [1] Disambiguation.scala uOn top of this: Looks like MediaWiki is currently superseding the MediaWiki:Disambiguationspage and moving to a magic word [1][2] There are already Wikipedias which have removed the MediaWiki:Disambiguationspage and transitioned to the DISAMBIGmagic word. Unfortunately this means that probably it will be needed to preprocess wikipedia dumps to extract the list of disambiguation templates by looking at DISAMBIGor any alias defined (which can be retrieved using GenerateWikiSettings.scala). WDYT? Andrea [1] [2] 2013/10/21 Andrea Di Menna < > uThe branches should be merged, of course. It's a matter of checking with Jona when. Or would Andrea be up for the challenge? Also, is this magic word within the template definitions or within the disambiguation pages? On Mon, Oct 21, 2013 at 8:26 AM, Andrea Di Menna < > wrote: uI could try but I am sure Jona knows exactly all the changes on the dump branch and could merge it more easily :-) For what regards the magic word it is required to be in the article itself, either directly or transcluded. But of course it is easier to modify the disambiguation templates other than editing each article :) Example of the change is in [1] For what regards the implementation of automatic download from the MediaWiki:Disambiguationspage I have reused some code from Jona [2] and added the relevant hooks in GeneratePageSettings.scala Will send a pull request as soon as I can. Cheers Andrea [1] [2] Il 22/ott/2013 06:37 \"Pablo N. Mendes\" < > ha scritto:" "Server performance" "uI have been using sparql.dbpedia.org for various sparql queries in a web app. Up until recently I usually get a response time of a few 100ms. However this month almost all response times have been in the range of 5-20 seconds. Is there any problem with the service, or how I'm using it? 
Or, is there anyway I can debug or investigate my issues? Thanks, Alan Patterson I have been using sparql.dbpedia.org for various sparql queries in a web app. Up until recently I usually get a response time of a few 100ms. However this month almost all response times have been in the range of 5-20 seconds. Is there any problem with the service, or how I'm using it? Or, is there anyway I can debug or investigate my issues? Thanks, Alan Patterson" "Public sparql endpoint maintenance" "uDear all, Since the sparql end point went down for maintenance yesterday I haven't been able to query it, even after the OpenLink twitter feed said it was back up again. All my queries return no results whereas I know they should. Any ideas on what's going on? Regards, Marieke van Erp uHi Marieke, The dbpedia sparql endpoint is online and I can query it without problems. Can you please confirm if your queries are still having problems and if so provide some samples ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 5 Nov 2010, at 07:41, Marieke van Erp wrote:" "chemistry-related infoboxes" "uHi list, Like others before me, I am interested in adding to dbpedia's parsing capabilities the power to correctly digest infoboxes that are associated with different types of chemical substances. These infoboxes are not stricly 'infoboxes', in that their names and syntax in the wiki markup is not exactly the same as for the infobox that would show up for, say, Innsbruck. Therefore, these boxes are skipped in the current system. What I am currently trying to do (and please stop me if you have already gotten beyond what I have, or have a better strategy) is to identify how many chemical subcategories (such as Household_chemicals, Solvents, etc.) have well-defined 'chemfoboxes' that could easily be sought out and parsed to create triples. My initial observation is that there are several types of chemfoboxes, so at the moment I am trying to compose a list of which categories of chemicals contain which types of chemfoboxes. When I have enough of these (categories whose members often have chemfoboxes) to cover perhaps a few hundred substances, I will then set to work trying to parse those that have chemfoboxes and ignore gracefully those that don't. I will share this when done. Your suggestions and comments are most welcome. Thanks, Christian uHi Christian, On 7/31/07, < > wrote: An overview of that would be useful to get things to use to same infoboxes. That sounds excellent. Egon uHi Christian, I suggest you start not with identifying all chemistry related categories, but looking for the different mechanisms/templates for representing chemicals in Wikipedia, like the chembox tables or Chembox_new. help. I created ChemboxExtractor.php for you, so you can start with writing code to parse articles that use the chembox \"template\". Extractors are usually processed on all Wikipedia articles, so I included a preg_match(\"{{chembox header\") at top to ensure the article uses that template before starting to parse. You can use extract.php to get single article sources from Live-Wikipedia, but if you want to try a complete extraction of all articles you will need to install a local Wikipedia MySql database. That's a bit tricky, but you can use our import.php script. It's important to configure MySql adequate, i.e. much memory to the key buffer and other caches. 
The import script will fail the first time, just try again, mysql needs to load everything into memory and that will cause a timeout at first. Comment files in import.php that are successfully progressed. Our extraction code is not yet documented, but that will hopefully happen the next days. Happy coding ;) Cheers, Georgi uHi, I have looked at the page you point to, And, so far as I can see, the best way to programmatically obtain as many chemicals that I can see is to use the three pages, And the titles listed in these provide links (when edited, which I have done in relatively little time using vi) to over 1900 chemicals, many of which have chemfoboxes. Unfortunately, these aren't accessible through any of the Category: links I could find, so that it's either up to us to write a custom parser just to get the chemical names, or to manually cut up the pages themselves (as I did, with vi enough shortcuts can be taken that it takes me less than 5 minutes per page). My next step will be to use your ChemboxExtractor.php to produce a file containing the bodies of all these 1900 chemicals, and then have a look at all the data fields that show up inside chemfoboxes, so I can devise a parsing approach. I already have a working local wikipedia installation, so that nightmare needn't be worried about at the moment ;-) Are my messages making it properly to the list? If so, shall I omit your email addresses when responding on this thread? Thanks, Christian Quoting Georgi Kobilarov < >: uOn 8/1/07, < > wrote: This may not be helpful, I'm not sure, but That will show things that have the infobox template only. Judson [[:en:User:Cohesion]] uwrote: It sounds as if you should work together with Do you? This seems to be one of the more active \"WikiProjects\". They have a \"collaboration of the month\" (or every 2nd month), which for July was \"liquids\". Why not suggest \"infoboxes\" as the collaboration theme for August. Note also that similar \"WikiProjects\" for chemistry exist in nine other languages as well. ucohesion wrote: See also the other boxes in the same category, Category:WikiProject_Chemistry_Templates" "Some fun reports" "uI ran some reports for an importance score against DBpedia here The importance score is proportional to how many hits there were on a topic page and all the redirects to it in Wikipedia. In the report above, the importance is the sum of importance over all units of a type (11 units of importance applied to people, about 7.7 on creative works, and 3.4 on places) If you go into a detail you see the top instances for a given type which are often amusing. Some immediate goals are to look at the cumulative importance distribution (which X% of concepts get Y% of the interest) and also run these on link-derived importance that Andreas Thalhammer published here:" "DBpediaLive - Update rate statistics" "uHi all, In changes performed in DBPedia Live during the last minute/hour/day, I would like to know if an history of this (maybe only the daily ones) is available somewhere. What I want in the end is to be able to say something like \"in june 2012, DBPediaLive had an average of xx inserts and yy deletes (if i can also have which pairs insert-delete are related as an update, would be awesome), which represents zz% of the total of triples\". 
If not, I think it can be done with some scripting (because I'm better at scripting than at Scala). IMHO it would be nice if each daily changeset had a \"statistics\" file like this: number of triples at the start of the day: N1; number of triples at the end of the day: N2; number of inserts: I; number of deletes: D; number of edits: U. I don't know if I can assume that if the same subject appears in an insert file and a delete file with the same timestamp it is an update, or if it's more complicated than that. If it is, well, inserts and deletes work for me, seeing each pair as a Delete-Insert SPARQL operation with ground triples and simply counting the sizes. What I don't know is where I can find the number of triples in the whole dataset for each day. Thanks in advance. uHi Luis, On 07/13/2012 04:27 PM, Luis Daniel Ibáñez González wrote: Actually the statistics you are looking for are not currently supported by DBpedia-Live, but we will place them in our future plans. But if you urgently need them, you can look at the changesets that are generated daily. Those changesets are available in [1]. You can simply download the file of a specific day, decompress it, and you will find 2 sets of N-Triples files; one for added triples, and one for deleted ones. You can simply count the number of triples in each to get the required results." "Running the extraction" "uHi, I am running the extraction. It has been going on for 6hr 30 mins now and is about 20 GB. I had to stop it as I am running out of disk space. How much bigger is it going to get? Thanks, Sreeni uHi Sreeni, the size of the dumps depends on the extractors and languages chosen. The English DBpedia 3.5.1 dump is 93.1 GB. The whole DBpedia 3.5.1 dump covering all 92 languages and all extractors given in the config.properties.default is 261 GB. Cheers, Anja On Jun 3, 2010, at 3:01 AM, Sreeni Chippada wrote: uHi Anja, Thanks for your response. 
-Sreeni" "Announcing OpenLink Virtuoso Open-Source Edition, Version 7.1.0" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 7.1.0: New product features as of February 17, 2014, V7.1.0, include: * Engine - Enhancements to cost-based optimizer - Added optimization when splitting on scattered inserts - Added optimization on fetching col seg - Added support for multi-threaded sync/flush - Added support for ordered count distinct and exact p stat - Added new settings EnableMonitor - Added BIFs key_delete_replay(), set_by_graph_keywords(), tweak_by_graph_keywords, vec_length(), vec_ref(), x509_verify_array(), xenc_x509_cert_verify_array() - Added new functions bif_list_names() and bif_metadata() - Added new general-purpose HTTP auth procedure - Added support for local dpipes - Added support for session pool - Added option to allow restricting number of id ranges for new IRIs - Added support for execution profile in XML format - Added support for PL-as-BIFs in SPARQL - Improved I/O for geometries in SQL - Fixed geo cost of non-point geos where no explicit prec - Fixed re-entrant lexer - Fixed RPC argument checks - Fixed memory leaks - Fixed compiler warnings - Treat single db file as a single segment with one stripe - Updated testsuite * GEO functions - Added initial support for geoc_epsilon(), geometrytype(), st_affine() (2D trans nly), st_geometryn(), st_get_bounding_box_n(), st_intersects(), st_linestring(), st_numgeometries(), st_transform_by_custom_projection(), st_translate(), st_transscale(), st_contains(), st_may_contain(), st_may_intersect() - Added new BIFs for getting Z and M coords - Added support for <(type,type,)type::sql:function> trick in order to eliminate conversion of types on function call - Optimization in calculation of GCB steps to make number of chained blocks close to square root of length of the shape - Fixed geo box support for large polygons - Fixed mp_box_copy() of long shapes - Fixed range checks for coordinates - Fixed calculation of lat/long ratio for proximity checks - Fixed bboxes in geo_deserialize - Fixed check for NAN and INF in float valued geo inx - Fixed check for NULL arguments - Minor fixes to other geo BIFs * SPARQL - Added initial support for list of quad maps in SPARQL BI - Added initial support for vectored IRI to ID - Added initial support for SPARQL valid() - Added new codegen for initial fill of RDB2RDF - Added new settings CreateGraphKeywords, QueryGraphKeywords - Added new SPARQL triple/group/subquery options - Added missing function rdf_vec_ins_triples - Added application/x-nice-microdata to supported SPARQL results output formats - Added support for built-in inverse functions - Added support for GEO-SPARQL wkt type literal as synonym - Added support for the '-' operator for datetime data types - Fixed issues in handling GEO predicates in SPARQL - Fixed RDF views to use multiple quad maps - Fixed issues with UNION and BREAKUP - Fixed dynamic local for vectored - Fixed Transitivity support for combination of T_DIRECTION 3 and T_STEP (var) - Fixed handling of 30x redirects when calling remote endpoint - Fixed support for MALLOC_DEBUG inside SPARQL compiler - Fixed TriG parser * Jena & Sesame - Improved speed of batch delete - Removed unnecessary check that graph exists after remove - Removed unnecessary commits - Replaced n.getLiteralValue().toString() with n.getLiteralLexicalForm() * JDBC Driver - Added statistics for Connection Pool - Fixed speed of finalize * Conductor and DAV - Added 
trigger to delete temporary graphs used for WebID verification - Added new CONFIGURE methods to DETs to unify folder creation - Added new page for managing CA root certificates - Added new pages for graph-level security - Added verify for WebDAV DET folders - Added creation of shared DET folders - Fixed creation of ETAGs for DET resources - Fixed DAV rewrite issue - Fixed DAV to use proper escape for graphs when uploading - Fixed issue deleting graphs - Fixed issue uploading bad .TTL files - Fixed issue with DAV QoS re-write rule for text/html - Fixed issue with user dba when creating DET folders - Fixed normalize paths procedure in WebDAV - Fixed reset connection variable before no file error * Faceted Browser - Added missing grants - Added graph param in FCT permalink - Changed labels in LD views - Changed default sort order to DATE (DESC) - Copied virt_rdf_label.sql locally - Fixed double quote escaping in literals - Fixed FCT datatype links - Fixed the curie may contain UTF-8, so mark string accordingly - Changed describe mode for PivotViewer link Other links: Virtuoso Open Source Edition: * Home Page: * Download Page: * GitHub: OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: * Interactive SPARQL Demo: OpenLink Data Explorer (Firefox extension for RDF browsing): * Home Page: Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // Weblog" "dbpedia redirects outside "en"?" "uI was looking at the Dbpedia 3.9 files today and I noticed that redirects are not available for Wikipedia outside \"en\" and I'm wondering why that is. Lately I've cooked down the wikipedia pagecounts to produce a \"3D\" data set that summarizes interest in topics (uh, hits to URIs) over the 2008-2013 time frame. The source code for this is This data has all sorts of problems, but probably the worst of them is that it is chock full of URIs that don't correspond to DBpedia concepts, for instance \"Justin_Bieber_Die_Die_Die_Die Die\", which are caused by people creating junk pages on Wikipedia that get deleted, people typing URIs wrong, etc. The obvious thing to do is to filter out topics that don't exist in DBpedia and also to resolve redirects so that people who visited \"Communists\" get credited as visiting \"Communism\" and so forth. I think a good list of valid DBpedia URIs can be had from the list of page id's in \"en\" and redirects can be gotten out of the \"transitive redirects\". En is responsible for about 1/3 of the views these days, but it would be really fun to have something that works in all culture zones so we can see what is \"big in Japan\" and so forth uOn 30 January 2014 19:23, Paul Houle < > wrote: They're available for all languages, e.g." "problem with maven-scalatest-plugin" "uHello I have a problem building the local mirror of the lookup webservice. When I run \"mvn clean install\" I got an exception on the maven-scalatest-plugin. I tried both jdk 1.7 and 1.6 and both have the same problem. Any suggestions?? Thanks Mena [INFO] [INFO]" "United_States.rdf" "uHi All I'm back form my holiday. Sorry about that :-& I can't get Any idea of why? Best, /Johs. uHi Johannes, The resource If you ask for data w/o compression and uncompressed entity body exceeds 10mb then you will get : HTTP/1.1 509 Bandwidth Limit Exceeded. Best Regards, Mitko On Apr 8, 2010, at 11:46 AM, Johannes Wehner wrote: uThank you! I'll try that. Happy Thursday! /J. 
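For the redirect-resolution step described above, the transitive-redirects dump is one option; the same lookup can also be sketched as a query against an endpoint that has the redirects dataset loaded, using the Communists-to-Communism example from the message:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?target WHERE {
  dbr:Communists dbo:wikiPageRedirects ?target .
}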
uHi Mitko, Is this error by design in Virtuoso or is it just implemented for DBpedia.org ? Seems a little arbitrary to stop serving information just because it is more than 10MB, which is not that huge an RDF file really. Compression will only make it slightly smaller, and in some cases even the compressed information might be more than 10MB, how does DBpedia plan to handle really large descriptions if they ever come up? Cheers, Peter On 8 April 2010 19:44, Mitko Iliev < > wrote: uPeter Ansell wrote: Peter, This is a setting on our part which is simply enforced via Virtuoso instance settings. As for descriptions, consumers are going to have to be smarter, there are many ways to get a description bar a mass dump of everything. 10mb is very generous bearing in my we are serving the whole world. GZIP is there to be used via HTTP. Kingsley uOn 9 April 2010 09:48, Kingsley Idehen < > wrote: As long as people know about it, it is okay. Hopefully the Linked Data that DBpedia serves isn't limited in too many cases by this lack of knowledge about having to use GZIP along with HTTP to resolve Linked Data, even though it is widely available. Overall though, it is ironic that Linked Data is already running into these capacity issues given its purpose is to get mass dumps of everything that is known about single resources by performing a simple HTTP URI resolution. Someone should implement a system that lets users know what the limits are in RDF, and where they can go to incrementally get more ;)The issue won't just go away by telling people to use SPARQL instead of Linked Data. Cheers, Peter uPeter Ansell wrote: Peter, I haven't had these restrictions applied on the basis \"use SPARQL instead of Linked Data\". They are applied because DBpedia is basically a very fuzzy project (its common definition and perception don't match *our* instance construction and upkeep reality), and I am now inclined to operate within *self imposed* narrower boundaries. DBpedia is comprised of the following (sadly unbeknown to most who assume its just about #1, but consume benefits of at least #1-3): 1. RDF Data Sets" "Data reconciliation with DBpedia" "uHello everyone,   I want to consolidate a list of terms (in computer science vocabulary) with DBpedia's labels/category names. I tried LODRefine (ver 1.0.7.1) toolkit but was unsuccessful. In fact, I couldn't create a \"reconciliation service based on an RDF\" or \"reconciliation service based on SPARQL\" in the tool. How could I create such services in the toolkit and perform reconciliation accordingly?  Please let me know how to proceed. Any suggestion for better toolkit or an approach is also appreciated. Thanking you in advance. Regards, Bahram Amini SERG research group,  Universiti Teknologi Malaysia, UTM Hello everyone, I want to consolidate a list of terms (in computer science vocabulary) with DBpedia's labels/category names. I tried LODRefine (ver 1.0.7.1) toolkit but was unsuccessful. In fact, I couldn't create a \" reconciliation service based on an RDF\" or \" reconciliation service based on SPARQL \" in the tool. How could I create such services in the toolkit and perform reconciliation accordingly? Please let me know how to proceed. Any suggestion for better toolkit or an approach is also appreciated. Thanking you in advance. 
Regards, Bahram Amini SERG research group, Universiti Teknologi Malaysia, UTM" "TemplateDB / TemplateAnnotation" "uHi, How do I create/ populate the TemplateDB database / TemplateAnnotation tables, required for LiveMappingBasedExtractor to run? Also, where do I get new copies of mapping.csv and hierarchy.csv ? Thanks Az Hi, How do I create/ populate the TemplateDB database / TemplateAnnotation tables, required for LiveMappingBasedExtractor to run? Also, where do I get new copies of mapping.csv and hierarchy.csv ? Thanks Az" "SPARQL Query following a resource redirect" "uHi Guys, Sometimes it is possible that you query for a resource in dbpedia, but this resource now has \"changed\" and therefore it would be a redirect to new resource name. This works when you give the link directly ie in the browser: will be redirected to How can I create a SPARQL query that also does follow the redirected resource? Is that possible and can someone please give me a simple example on this simple query, since I didn't find anything about that. As far as I understood one could use \"dbprop:redirect\" but I'm just not sure how to apply it on this simple query: SELECT ?p ?o WHERE { ?p ?o } thx in advance martin uSolved, mhausenblas in irc channel #swig solved my issue. For Example: If you only want the german comment back and also want to consider a possible redirection you can use redirect like that: SELECT ?o WHERE { ?porigin ?oorigin ; dbpedia2:redirect ?redirectTarget . ?redirectTarget rdfs:comment ?o . FILTER ( LANG ( ?o ) = \"de\" ) } see you martin Zitat von Martin Kammerlander < >: uHi Guys again :) I have a second related question here. I have some resources here that I want to query that have no redirection. So the query mentioned in my last mail returns \"NULL\" So I wonder now what the best way to proceed is. is it somehow possible to tell a the query below to follow the redirection IF and only IF a redirection exists, and otherwise return the comment directly (with no redirect). Another way would be to send the normal query. Then if the result is NULL this could mean tehre is a redirection. tehn send the second Query that can follow the redirection. But I'd like to avoid that. If someone could make an example Query that would be really appreciated. best martin Zitat von Martin Kammerlander < >:" "Parsing wikipedia XML dump using extraction framework" "uHi, Folks, I am trying to run the dbpedia extraction framework offline on XML dump that I had previously downloaded. I am only interested in getting the triples for Mapping and Infobox extractors. When I run the extractor, I get an error: uSorry for this misleading message, you can ignore it. Some parts of the framework need the list of redirect pages. The redirect list is stored in a file. If the file does not exist yet, the redirects are extracted from the XML dump and the file is created. That's all this message is saying. On Feb 10, 2015 1:44 AM, \"Mandar Rahurkar\" < > wrote:" "Contact for Polish" "uDear dbPedia users, Does anyone know who I could contact regarding the Polish dbPedia chapter? It has been off-air for a number of days now. Kind regards, Astrid van Aggelen" "paragraph detection in extraction_framework" "uI'm trying to split Wiki text into paragraphs using the extraction_framework. ie val parser = WikiParser() val pageNode = parser.apply(wikiPage) when I println each node, i don't see a representation of a paragraph break. 
for the nodes: \"Catalonia\" and \".Andorra is a prosperous\" pageNode.children.foreach(println): InternalLinkNode(en:Bishop of Urgell,List(TextNode(Bishop of Urgell,1)),1,List(TextNode(Bishop of Urgell,1))) TextNode(, ,1) InternalLinkNode(en:Catalonia,List(TextNode(Catalonia,1)),1,List(TextNode(Catalonia,1))) TextNode(.Andorra is a prosperous country mainly because of its ,1) InternalLinkNode(en:Tourism industry,List(TextNode(tourism industry,1)),1,List(TextNode(tourism industry,1))) uI also tried to deal with this about a year ago but then I discovered that abstracts are generated with a local mediawiki installation and left it. Node subclasses implement a toWikiText() function which tries to re-calculate the wiki text (I think that they are not all correctly defined) Maybe we could try to create a toText() function that strips all wiki formatting and discards templates and (maybe) tables and compare this output with the current abstract output. This will definitely be much faster Dimitris On Wed, Oct 12, 2011 at 1:24 AM, Tommy Chheng < > wrote: uHi Tommy, You can also try to look at how DBpedia Spotlight extracts paragraphs from Wikipedia using the DBpedia Extraction Framework. Best, Pablo On Fri, Oct 14, 2011 at 2:47 PM, Dimitris Kontokostas < >wrote:" "how to discover properties of classes?" "uIs there a way to discover properties of classes of the DBpedia dataset? For instance, given the car class can some DBpedia service return manufacturer, fuel type, engine, transmission, etc ? I see 'browse class, properties' in the snorql UI, but I couldn't get an example to run. Can I expect it to return this kind of information? Or is it that one would have to get the class (perhaps with the 'a' SPARQL verb) and the data would have to reference some external RDFS? Thanks, Gustavo Frederico uHi Gustavo, there is no such user interface yet. But I'm working on one as that is a repeatedly asked feature. You can query for all properties of a given class (for example \"car\") with following sparql-query at select distinct ?p where { ?s ?p ?o . ?o rdf:type . } Cheers, Georgi uGeorgi Kobilarov wrote: Gustavo, I have some screencast re. the iSPARQL Query Builder that can help you use and learn SPARQL. The Query Builder is at: The Screencasts are at: Kingsley" "using localized uri" "uHey dbpedia team, I struggle to send query using nationalized URIs. Specifically in Czech language with diacritics. running on select ?x where { ?x 116812 . } LIMIT 100 gets me correctly to this page: but when i write this: select ?x, ?n where { ?x 116812 . ?x ?n . } LIMIT 100 or this: select ?x, ?n where { ?x 116812 . ?x ?n . } LIMIT 100 I get an empty result. What is the correct way to use URI's with diacritics? Thanks in advance, kub1x Hey dbpedia team, I struggle to send query using nationalized URIs. Specifically in Czech language with diacritics. running on kub1x uThis seems to work: select ?x ?n where { ?x 116812 . ?x prop-cs:názevTvrze ?n } LIMIT 100 Best, Volha On 5/24/2014 9:09 AM, Jakub Podlaha wrote:" "Some italian mappings problems" "uHello everyone, i've found some problems in the italian mappings, precisely in mappingbased_properties_it the problem i noticed is in the relation values, for example: < http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . < http://dbpedia.org/ontology/populationTotal> \"1\"^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . < http://dbpedia.org/ontology/populationTotal> \"1\"^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . 
< http://dbpedia.org/ontology/populationTotal> \"385\"^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . < http://dbpedia.org/ontology/populationTotal> \"3\"^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . < http://dbpedia.org/ontology/populationTotal> \"1\"^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger> . I can't figure out why does this problem occur because, looking at the wikipedia templates, cometimes there's a point tetween the numbers, sometimes a colon, sometimes are correct: damascus: |Abitanti = 1614297 Kawasaki: |abitanti = 1 385 003 Wenzhou: |abitanti = 1.164.800 (7.558.000 la prefettura) I think someone should look into this and fix it for the next version of dbpedia :) Regards, Piero Molino Hello everyone, i've found some problems in the italian mappings, precisely in mappingbased_properties_it the problem i noticed is in the relation http://dbpedia.org/ontology/populationTotal . A lot of cities have wrong values, for example: < http://dbpedia.org/resource/Paris > < http://dbpedia.org/ontology/populationTotal > '2'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kyoto > < http://dbpedia.org/ontology/populationTotal > '1'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kyoto > < http://dbpedia.org/ontology/populationTotal > '464'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kyoto > < http://dbpedia.org/ontology/populationTotal > '990'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Damascus > < http://dbpedia.org/ontology/populationTotal > '1'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kawasaki,_Kanagawa > < http://dbpedia.org/ontology/populationTotal > '1'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kawasaki,_Kanagawa > < http://dbpedia.org/ontology/populationTotal > '385'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Kawasaki,_Kanagawa > < http://dbpedia.org/ontology/populationTotal > '3'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . < http://dbpedia.org/resource/Wenzhou > < http://dbpedia.org/ontology/populationTotal > '1'^^< http://www.w3.org/2001/XMLSchema#nonNegativeInteger > . I can't figure out why does this problem occur because, looking at the wikipedia templates, cometimes there's a point tetween the numbers, sometimes a colon, sometimes are correct: damascus: |Abitanti = 1614297 Kawasaki: |abitanti = 1 385 003 Wenzhou: |abitanti = 1.164.800 (7.558.000 la prefettura) I think someone should look into this and fix it for the next version of dbpedia :) Regards, Piero Molino uHi Piero, thanks for the report! Maybe you could file a bug at would be great! The number parsers are not yet fully internationalized. They always expect English number format: decimal separator is the dot '.', thousands separator is the comma ',', and spaces cannot be used within a number. This is wrong for many other languages, but explains why the population of Wenzhou is extracted as '1': we try to extract an integer from '1.164.800', and since '.' is expected to be the decimal separator, we just extract '1'. Numbers that contain spaces are parsed as multiple numbers. As for Damascus - we extracted data for Damascus from this revision: (Or a revision close to it. 
We really should include easily accessible links to the page revision we used) This revison used abitanti =1 614 297 Which has the same problem with spaces as the other cities. Regards, JC On Wed, Mar 21, 2012 at 17:12, Piero Molino < > wrote: uHi Piero, I was wrong - the number parsers are internationalized. The issue is the space as thousands separator. I added a bug report here: Regards, Christopher On Wed, Mar 21, 2012 at 18:15, Jona Christopher Sahnwaldt < > wrote: uSome issues are \"sticky\" and coming back over and over again :) Although we are \"technically correct\" the problem still exists Cheers, Dimitris On Wed, Mar 28, 2012 at 4:33 AM, Jona Christopher Sahnwaldt < u/me wonders if just removing spaces and stopping on the first char that is not either a number or a number separator wouldn't fix at least the most obvious ones? On Wed, Mar 28, 2012 at 9:25 AM, Dimitris Kontokostas < >wrote: uYes, that would *probably* work. I'm just worried that it will introduce new errors, and it's hard to test, because AFAIK we have far too few test cases But maybe we should just do it. Could there be a case where two numbers are separated only by space, but are two separate numbers? Well, I'm sure there are hundreds of such cases in Wikipedia, but even to a human it would usually look funny. Let me speculate: Population 67278 37473 in winter Looks weird Well, there could be list of numbers, of course. Let's say, which numbers has a football player worn in his career: numbers 3 8 6 9 Still a bit weird, but possible Hmmwe should investigate further before we decide. JC On Wed, Mar 28, 2012 at 15:14, Pablo Mendes < > wrote: uI agree that there is no single solution for this. For me the best option would be to extend the mapping language and give this option to the mapping editors on a per mapping basis. What do you think of that? Dimitris uJust to help you out i did a bit more investigation and found come other examples of number errors. Maybe they are not all coming from the \"scape between numbers\". i paste the triples (taken from the english mappings) here and comment a bit each one, hoping they will help. \"8.880349646476256\"^^ . \"8.9\"^^ . \"1.0985771268444487E12\"^^ . \"1.098581E12\"^^ . \"5.090103633243341E10\"^^ . \"5.11E10\"^^ . \"5.6593830198951935E10\"^^ . \"5.6594E10\"^^ . \"75.6760230743194\"^^ . \"76.0\"^^ . \"1585634.0\"^^ . \"2.321767472E7\"^^ . \"1.614E8\"^^ . \"1.610972604628992E8\"^^ . \"6975.0\"^^ . \"6508.524086550011\"^^ . \"13.0\"^^ . \"13.1064\"^^ . \"7.88651379597312E10\"^^ . \"7.8866E10\"^^ . \"131.66083606297406\"^^ . \"133.0\"^^ . \"6.229802001315594E11\"^^ . \"6.22984E11\"^^ . \"7.104279717181004\"^^ . \"7.1\"^^ . \"428.8536\"^^ . \"429.0\"^^ . \"1912.0104\"^^ . \"1912.0\"^^ . \"10.409314194304342\"^^ . \"10.39\"^^ . \"1.34679381737472E8\"^^ . \"1.35E8\"^^ . In the Examples above, i think the problem of the double value cames from the fact that there are 2 numbers in the template, one for miles and one for km (or other corresponding scales). The parser probably parses both, but the conversion sesults in 2 different triples, when only one should be the correct one. one idea on how to solve the problem is, if there are 2 different numbers in 2 different scales, take the international one, otherwise in there's only the local one (like miles) convert it. 
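As an aside to this thread, cases like the double values above are easy to hunt for with a diagnostic query against the endpoint. A minimal sketch, restricted to the populationTotal property discussed here (the same pattern works for areaTotal and the other affected properties):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT ?resource ?value1 ?value2
WHERE {
  ?resource dbpedia-owl:populationTotal ?value1 .
  ?resource dbpedia-owl:populationTotal ?value2 .
  FILTER ( ?value1 < ?value2 )
}
LIMIT 100

Using < instead of != in the FILTER keeps each conflicting pair from being reported twice. Resources returned here either had a value mis-parsed (dots, commas or spaces as thousands separators) or picked up two values from a unit conversion, as discussed above.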
By the way this could lead to other problems: this is a perfect example of spaces between numbers where the second number is independent from the first one, soi don't really know how this has to be solved Here are some other erroneus triples: \"4860.0\"^^ . \"5280.0\"^^ . \"5760.0\"^^ . Multiple values, onl one is the right one. \"-130.0\"^^ . the value in wikipedia is completely different and also the datatype seems strange (and the fact that the value is negative obviously. Here is another strange triples, that have nothing to do with numbers: . . . . . . . . \"1991\"^^ . \"1994\"^^ . \"1994\"^^ . Hope this will help improving the extractor ;) Piero Molino Il giorno 28/mar/2012, alle ore 16:43, Jona Christopher Sahnwaldt ha scritto: uHi there, Thanks for the analysis. It is very helpful. that there are 2 numbers in the template, one for miles and one for km (or Rightseems to be a problem generated by rounding when doing the conversion between scales. One could just squash two numbers that are within a given \delta, or they could pick the international as you suggested. By the way this could lead to other problems: this is a perfect example of Can you point me out to one example? All examples I found have at least parenthesis or a comma. Cheers, Pablo On Wed, Mar 28, 2012 at 5:10 PM, Piero Molino < >wrote: uHi Pablo, i can't find one example, i controlled and clearly there's always a measure identifier or parentheses or commas. Sometimes there's only the measure and the whitespae before the next one, this probably led me to remember only numbers, instead of numbers+measures. So, probably finding that the value of the template is expressed in 2 different measures is a bit less difficult thanI'd expected. Hope to find this improvement in thext version of dbpedia mappings :) Cheers, Piero Il giorno 28/mar/2012, alle ore 17:17, Pablo Mendes ha scritto: uYes, that sounds great! Of course, it's a lot more effort than adding a space character to a regular expression. :-) In general, I think we could and should do a lot more of the configuration on the wiki. Actually, most of it. There are many other extractors besides the mapping based that could even be \"programmed\" on a wiki page, because the basic process is almost always the same: filter pages in a certain namespace, filter certain nodes from their content (templates, links, category links), and get some data from the nodes. But that's a pipe dream. In the shorter run, yes, it would be nice to allow adding special parsing rules to properties. Should we use template properties or ontology properties? Or maybe it would be better to configure them per language? I'm not sure. Anyway, even for such a relatively small change, it's hard to find the human resources:-( Christopher On Wed, Mar 28, 2012 at 16:58, Dimitris Kontokostas < > wrote:" "Consultation: DBpedia Spotlight Requirements" "uHello from Appstylus S.L. 
in Spain, I have tried adding the surface forms to the index by executing: # add surface forms to index mvn scala:run -DmainClass=org.dbpedia.spotlight.lucene.index.AddSurfaceFormsToIndex \"-DaddArgs=$INDEX_CONFIG_FILE\" with the following memory parameters: export JAVA_OPTS=\"-Xmx4G\" export MAVEN_OPTS=\"-Xmx4G\" export SCALA_OPTS=\"-Xmx4G\" because my computer only has 4 GB RAM, so after 21 hours of waiting the execution stopped with the following error: INFO 2011-12-22 17:05:55,090 main [AddSurfaceFormsToIndex$] - Getting surface form map java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161) at org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26) Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.regex.Matcher.toMatchResult(Matcher.java:252) at java.util.Scanner.match(Scanner.java:1287) at java.util.Scanner.hasNextLine(Scanner.java:1495) at org.dbpedia.spotlight.lucene.index.AddSurfaceFormsToIndex$.loadSurfaceForms(AddSurfaceFormsToIndex.scala:84) at org.dbpedia.spotlight.lucene.index.AddSurfaceFormsToIndex$.main(AddSurfaceFormsToIndex.scala:115) at org.dbpedia.spotlight.lucene.index.AddSurfaceFormsToIndex.main(AddSurfaceFormsToIndex.scala) Does it mean that I don't have enough heap memory? If so, how much memory do I need? Thanks in advance, Jairo Sarabia AppStylus developer uApparently the only configuration that works with the maven scala plugin is the one performed within the pom.xml using the jvmArgs tag. There are examples in our repo. Best Pablo On Dec 23, 2011 11:12 AM, \"Jairo Sarabia\" < > wrote:" "foaf:page / foaf:primaryTopic in DBpedia" "uHi, I think there's a slight inconsistency in some FOAF triples published by DBpedia. For example, see these triples from : . . DBpedia basically uses foaf:primaryTopic and foaf:page as inverse properties, but the foaf spec says: foaf:page -> The page property relates a thing to a document about that thing. As such it is an inverse of the topic property, which relates a document to a thing that the document is about. foaf:primaryTopic -> The primaryTopic property relates a document to the main thing that the document is about. It is an inverse of the isPrimaryTopicOf property, which relates a thing to a document primarily about that thing. I think DBpedia should use either - foaf:primaryTopic and foaf:isPrimaryTopicOf or - foaf:page and foaf:topic What do you think? Cheers, Christopher uHi, DBpedia will use foaf:primaryTopic and foaf:isPrimaryTopicOf in 3.8 (not foaf:primaryTopic and foaf:page as before). Seemed the most sensible combination to me, but I'm not a FOAF expert. Please let me know soon if you think that's a bad idea. Cheers, JC On Tue, May 15, 2012 at 8:47 PM, Jona Christopher Sahnwaldt < > wrote:" "Dumps Available, and Class syntax: underscores vs. camel case" "uHi Rich, Interesting points, thanks! I guess I would argue that the appropriate place to define human-readable labels for these classes would be as rdfs:labels in any class hierarchy (or similar) generated from the YAGO data, which would presumably use a different workflow for its generation (i.e. a separate processing of the YAGO dataset, which could generate labels from the source data, without having to parse out class URIs in the DBpedia URI-space to do so), if you know what I mean. I guess this comes down to following best practice for minting URIs, and relying on triples served elsewhere, and generated by other means, to define the human-friendly view. Having said that, I'm not much up on the issues regarding Asian languages, or how things are handled in WordNet. Out of interest, and as a way to increase my understanding, any chance you could share some examples? That would be great. Anyway, we've produced two dumps, which are available at: and Hopefully we can reach a consensus on/by Monday, at which point it would be great if someone from the DBpedia team could drop one of these (depending on what is decided) in place of . Cheers, Tom.
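Tom's point about keeping the human-readable form in rdfs:label rather than in the class URI itself can be illustrated with a small query. A minimal sketch, assuming the usual http://dbpedia.org/class/yago/ namespace for the YAGO-derived classes and that the corresponding label triples are loaded; Andorra is just an arbitrary example resource:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?class ?label
WHERE {
  <http://dbpedia.org/resource/Andorra> rdf:type ?class .
  ?class rdfs:label ?label .
  FILTER ( regex(str(?class), "^http://dbpedia.org/class/yago/") )
}

A client that works this way never needs to care whether the local name of the class URI uses underscores or camel case.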
On 23/06/07, Rich Knopman < > wrote:" "Freebase and dbpedia" "uHi, there is a nice blog post on how Freebase relates to the Semantic Web: In an answer to a comment Patrick Tufts from Metaweb ( and dbpedia could work together. Snip: \"tndal, you are correct. Dbpedia and Freebase are working in the same direction, but there are some differences as well around how the two projects deal with data. As an active Wikipedian, I'd like to see dbpedia and Freebase work together. I invite anyone from dbpedia to contact me (my first name AT metaweb.com) or via my Wikipedia user page: to learn more about Freebase and talk about how we can work together to give Wikipedians better tools for reviewing structured information.\" I also got a mail from Patrick. I think Freebase is a very cool tool and it has one of the best \"Semantic Web\" user interfaces I saw till now. So we should definitfly talk with Patick about cooperations. I think our data could help bootstrapping Freebase, but I would also like to see the Freebase data integrated into the dataset of the Linking Open Data project ( Cheers Chris" "LESS - Content Syndication based on Linked Data" "uHi all, On behalf of the AKSW research group [1] and Netresearch GmbH [2] I'm very pleased to announce LESS - an end-to-end approach for the syndication and use of linked data based on the definition of visualization templates for linked data resources and SPARQL query results. Such syndication templates are edited, published and shared by using LESS' collaborative Web platform. Templates for common types of entities can then be combined with specific, linked data resources or SPARQL query results and integrated into a wide range of applications, such as personal homepages, blogs/wikis, mobile widgets etc. LESS and further information and documentation can be found at: Particular thanks go to Raphael Doering (Netresearch) who performed most of the development work and to Sebastian Dietzold (AKSW) for contributing in various ways. Cheers, Sören Auer [1] [2] http://netresearch.de uOn 21.01.2010 10:10, Pierre-Antoine Champin wrote: We did ;-) - e.g. it is referenced in the related work section of our report on LESS [1]. Indeed the aims of T4R and LESS are very similar. However, LESS focuses also on sharing and collaboration on templates and it is very much aligned with the Linked Data paradigm (e.g. LESS dynamically dereferences additional resources). We were actually thinking about supporting different template languages (in addition to our LeTL) at a later stage and T4R might be an interesting candidate." "Querying for chemical properties" "uHi there, I'm writing a matlab-like open source program and I'd like to import dbpedia's information about chemical compounds. My SPARQL knowledge is very bad and I tried to build a corresponding query but it didn't work. Can anybody tell me how to get e.g. the melting temperature of hydrogen? Cheers, Manuel PS: I looked like there is no such information for water. Or is there? uIl 09/03/2010 14:54, Manuel Schölling ha scritto: If you look at Wikipedia page ( has the value you are looking for. Unfortunately current DBpedia dump does not explicit it ( page there are a lot of related instances. Anyway, as you can see in the Wikipedia page about Hydrogen, it has a specific Infobox ( On the contrary Water ( belong to any Infoboxes. 
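As a side note, an easy way to check which infobox-derived facts DBpedia actually holds for one specific compound is to list all of its properties in the two relevant namespaces. A minimal sketch, assuming the English resource URI for Hydrogen:

SELECT ?property ?value
WHERE {
  <http://dbpedia.org/resource/Hydrogen> ?property ?value .
  FILTER ( regex(str(?property), "^http://dbpedia.org/(property|ontology)/") )
}

If no meltingPoint property shows up in the result, the value simply was not extracted from the infobox in the loaded dump, which is the situation described in this thread.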
If you want to ask DBpedia SPARQL endpoint for melting points, you could start with: SELECT * WHERE { { ?s ?o } UNION { ?s ?o } } The \"melting point\" property has to be present in the corresponding WIkipedia Infobox, such as for example the Drugbox one ( cheers. uIl 09/03/2010 23.56, Manuel Schölling ha scritto: Actually the two properties links mainly the same resources. With UNION you will not have duplicates. The \"ontology\" is newer in DBpedia wrt to the \"property\". \"Ontology\" is more fine-grained. Unfortunately you can't do it right now with Hydrogen. The reason is that Hydrogen does not belong to \"Drug\" template. It has a specific template, probably it is newer than the last DBpedia crawling. Probably next dump of DBpedia will contain the information you are looking for, but at the present moment it is not there (I have not found it at least). If you want to see which resources do have a melting point, simply execute the query I wrote previously, or navigate to: cheers, roberto uOk, now I got it! Thank you very much, guys! This is really cool. There even is support for i18n: SELECT ?meltingPoint WHERE { ?s rdfs:label \"Nikotin\"@de . { ?s ?meltingPoint } UNION { ?s ?meltingPoint } } Just two final questions: - This query [1] returns two melting point information about nicotine: \"-79\" -79 is there any possibility to convert the string data to integer? - What about the unit here (°C/°F/°R/K)? Is there a convention about temperatures and other units or is it possible that this may vary from resource to resource? Cheers, Manuel [1] ?query=SELECT+%3FmeltingPoint%0D%0AWHERE+{%0D%0A++++%3Fs+rdfs%3Alabel+%22Nikotin%22%40de+.%0D%0A++{+%3Fs+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FmeltingPoint%3E+%3FmeltingPoint+}%0D%0A++UNION%0D%0A++{+%3Fs+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FmeltingPoint%3E+%3FmeltingPoint+}%0D%0A}%0D%0A uIl 10/03/2010 11:32, \"Manuel Schölling\" ha scritto: If you look at meltingPoint range in the ontology ( is a double. On the contrary, in this case, the Nicotine resource is linked through dbprop:meltingPoint to an integer (you can check it at Concerning the units, in this case it is expressed in Celsius degrees. I don't know if it is a convetion to use standard units. In Wikipedia page there is both Celsius and Fahrenheit (as you can see at Cheers, roberto uNeither SELECT xsd:decimal(?meltingPoint) WHERE { ?s rdfs:label \"Nikotin\"@de . ?s ?meltingPoint } nor SELECT ?mp WHERE { ?s rdfs:label \"Nikotin\"@de . ?s ?meltingPoint ?meltingPoint xsd:decimal ?mp } works. Anybody knows what the correct syntax is? Cheers, Manuel uIl 10/03/2010 13:20, \"Manuel Schölling\" ha scritto: In SPARQL you can not convert a value from a datatype into another one. SPARQL is a Query Language (the last two letters of the acronym). If you need to convert values you have to process the results with another language (e.g. PHP, Java, Javascript, etc.). You're righthopefully this is am issue that it is gonna change. ;-) cheers, roberto uHi, Are you sure about that? At least for the FILTER statement convert functions are allowed. See [1] at the end of the paragraph: FILTER ( xsd:dateTime(?date) < xsd:dateTime(\"2005-01-01T00:00:00Z\") ) That's why I'm wondering whether these conversion are allowed in the WHERE statement, too. [1] #tests uIl 11/03/2010 11:13, \"Manuel Schölling\" ha scritto: Casting is allowed only in FILTERing (using Xpath casting functions). In fact you can query something as: SELECT ?meltingPoint WHERE { ?s rdfs:label \"Nikotin\"@de . 
{ ?s ?meltingPoint } UNION { ?s ?meltingPoint } . FILTER( xsd:decimal($meltingPoint)) } This is valid, but you cannot say \"convert the results into another datatype\"I think. ;-) chhers, roberto uBut that's ok. All I want is that the results \"-79\" and -79 are merged by the UNION statement and this is accomplished using FILTER. Thanks Roberto! Now the my last issue is the unit of the meltingPoint property/ontology. If anybody knows if there is a convention about it, please tell me. And if you know that there is no such convention and one cannot rely on this information, please tell me as well. Cheers, Manuel uIl 11/03/2010 13:39, \"Manuel Schölling\" ha scritto: Actually FILTER does not merge results, but it drops out results that do no respect the filtering. For example, if your query is: SELECT ?meltingPoint WHERE { ?s rdfs:label \"Nikotin\"@de . { ?s ?meltingPoint } UNION { ?s ?meltingPoint } . FILTER( xsd:string($meltingPoint)) } You will not have any results. cheers, roberto uOk, now I know how this works. There are two types of ontologies: loose ones and strict ones. - When you use ontology/meltingPoint (loose) you cannot be sure about the unit of the information. - But if you use ontology/Drug/meltingPoint (strict) this information is normalized. You can look up the normalized unit of each strict ontology in May I make a proposal? Is it possible to include the unit of the strict ontologies into the dbpedia database? Thank you very much! Manuel" "DBpedia 2015?" "uHi, it seems as if you're hosting the 2015-04 dump for a while nowIs it stable yet? Did I maybe just miss the announcement? Im asking cause I'd be interested in upgrading this guide I tried to find out what datasets are loaded online via \"There is a list of all DBpedia data sets that are currently loaded into the SPARQL endpoint.\" Digging deeper into the new page i found Also, could you maybe put md5sum files in each directory to check the downloads? This would also allow me to skip re-downloading files in core-i18n after already downloading core Best, Jörn uHello Jörn, the announcement is due to arrive in the next days and should be stable at this point. Unfortunately the of dead links. We are going to remedy this matter soon. You were right in assuming that the currently loaded datasets. After a quick look at your blog: Let me point your attention to this: Since the authors are going to dockerize DBpedia endpoints, maybe you want to have a closer look. Best regards, M. Freudenberg. 2015-08-13 14:49 GMT+02:00 Jörn Hees < >:" "Incremental diff of DBpedia dataset" "uHi Roberto, Sorry, we don't have plans to provide diffs between the different releases. We are also working towards synchronizing DBpedia live with Wikipedia. When this is deployed, we will stop to have version numbers as the dataset will change in a continuous fashion. Maybe, we could then provide change notifications via a subscription model using Triplify Updates or another of the protocols currently discussed at But this are all just vague plans for the future. Cheers, Chris" "help writing a query" "uHello, What could be a query to get all the *current* countries in the world? may be, looking for the ones that dbpedia2:yearEnd is not defined. How do a filter by the \"undefined\" properties? This query has the opposite effect: it only finds out the ones that dbpedia2:yearEnd is defined. 
++++++++ SELECT ?state WHERE { ?state rdf:type < ?state dbpedia2:yearEnd ?yearEnd } ++++++++ Many thanks, DAvid Hello, What could be a query to get all the *current* countries in the world? may be, looking for the ones that dbpedia2:yearEnd is not defined. How do a filter by the 'undefined' properties? This query has the opposite effect: it only finds out the ones that dbpedia2:yearEnd is defined. ++++++++ SELECT ?state WHERE { ?state rdf:type < DAvid uDavid, I don't know how to get the list of all current countries, but I can tell you how to query for countries that don't have a yearEnd: On 25 Nov 2007, at 15:44, David Portabella Clotet wrote: Step 1: Let's make yearEnd optional SELECT ?state WHERE { ?state rdf:type . OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } } Step 2: Take only those for which the ?yearEnd variable is not bound: SELECT ?state WHERE { ?state rdf:type . OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } FILTER (!bound(?yearEnd)) } Unfortunately, the list still contains countries that do no longer exist. Richard uDavid, I don't know how to get the list of all current countries, but I can tell you how to query for countries that don't have a yearEnd: On 25 Nov 2007, at 15:44, David Portabella Clotet wrote: Step 1: Let's make yearEnd optional SELECT ?state WHERE { ?state rdf:type . OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } } Step 2: Take only those for which the ?yearEnd variable is not bound: SELECT ?state WHERE { ?state rdf:type . OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } FILTER (!bound(?yearEnd)) } Unfortunately, the list still contains countries that do no longer exist. Richard uHello, The filter is working, thanks!! I have another \"simplier\" question. This query should filter out all countries with less than 10 millions inhabitants. However, it does not work. Austria, for example, reports to have 8 millions inhabitants (\"8199783\"^^xsd:Integer), but the filter does not work. +++++++++++++ SELECT ?state, ?population WHERE { ?state rdf:type . ?state dbpedia2:populationEstimate ?population FILTER(xsd:Integer(?population) > \"10000000\"^^xsd:Integer). OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } FILTER (!bound(?yearEnd)) } ++++++++++++ What can be the problem? Regards, DAvid Hello, The filter is working, thanks!! I have another 'simplier' question. This query should filter out all countries with less than 10 millions inhabitants. However, it does not work. Austria, for example, reports to have 8 millions inhabitants ('8199783'^^xsd:Integer), but the filter does not work. +++++++++++++ SELECT ?state, ?population WHERE { ?state rdf:type < DAvid uDavid, That's caused by a bug in the DBpedia extraction code. RDF's integer datatype is \"xsd:integer\", but DBpedia incorrectly uses \"xsd:Integer\" (note the upper-case I). You can sort of make this work by changing the I to lower-case in your query: FILTER(xsd:integer(?population) > 10000000). (I also simplified the filter by using an integer literal instead of the generic typed literal syntax.) But the values in the output will still be reported as \"12345678\"^^xsd:Integer, so you might run into problems when passing the result to other tools. Richard On 29 Nov 2007, at 10:29, David Portabella Clotet wrote: uGreat! Many thanks, DAvid On Nov 29, 2007 12:22 PM, Richard Cyganiak < > wrote: uHello, One very last question: DBPedia is linking to the CIA Factbook. So, for France, we have the link to a RDF version of the CIA Factbook: owl:sameAs However, all this data is not present in DBPedia, right? 
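Before the Factbook side of this thread continues below, it is worth pulling Richard's two fixes together into a single query: the OPTIONAL/!bound pattern for dropping countries that have a yearEnd, and the lower-case xsd:integer cast in the population filter. A minimal sketch, assuming dbpedia2 is the http://dbpedia.org/property/ namespace used in this thread and using dbpedia-owl:Country as a stand-in for whichever country class the original query had:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?state ?population
WHERE {
  ?state a dbpedia-owl:Country .
  ?state dbpedia2:populationEstimate ?population .
  OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd }
  FILTER ( !bound(?yearEnd) )
  FILTER ( xsd:integer(?population) > 10000000 )
}

As Richard notes, the result still depends on how consistently yearEnd is filled in on Wikipedia, so some former countries will slip through.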
factbook:GDP_percapita_PPP 30100 (xsd:long) Is it possible to query (and/or filter by) this data? (linking somehow to Something like this: +++++++++++++++++++++++ PREFIX factbook: SELECT ?state, ?population, ?GDP_percapita_PPP WHERE { ?state rdf:type . ?state dbpedia2:populationEstimate ?population. ?state factbook:GDP_percapita_PPP ?GDP_percapita_PPP. FILTER(xsd:integer(?population) < \"5000000\"^^xsd:integer). OPTIONAL { ?state dbpedia2:yearEnd ?yearEnd } FILTER (!bound(?yearEnd)) } +++++++++++++++++++++++ Or do you plan to import all this data to DBPedia? Many thanks, DAvid Hello, One very last question: DBPedia is linking to the CIA Factbook. So, for France, we have the link to a RDF version of the CIA Factbook: DAvid uDavid, On 29 Nov 2007, at 11:48, David Portabella Clotet wrote: Right. It's just a link to related data. (Analogy: Wikipedia contains links to related web pages, but not the full text of those web pages.) You need additional tools to use the linked data. Not with standard SPARQL. There are several experimental tools that could help here, but that would involve a lot more effort. There's the Semantic Web Client Library [1] which was designed for this kind of scenario. Jena's ARQ query engine supports an extended version of SPARQL, which lets you use the SERVICE keyword [2] to address multiple SPARQL endpoints, so it should be possible to write a query against both the DBpedia endpoint and the World Factbook endpoint to get the desired result. DARQ [3] sort of does the same thing, but in a more transparent way. The project didn't show much activity recently though. Finally, Virtuoso has the ability to load additional data on-the-fly while answering a SPARQL query [4]. DBpedia runs on Virtuoso, so it might actually be possible to do this with the current DBpedia setup, but I never tried. I don't know if the Virtuoso feature is enabled for the DBpedia installation. Richard [1] [2] [3] [4] (look at the examples in the input:grab section)" "GSoC project idea" "uHi everyone, I've been considering a Google Summer of Code project idea for DBPedia which would use Salient Semantic Analysis (an extension of Explicit Semantic Analysis) to disambiguate named entities in DBPedia. This could be an autonomous component, or a complement to the other methods being used to label named entities. Before fleshing out this idea further, I wanted to ask the community if anyone has tried using Latent Semantic Analysis, ESA or SSA for disambiguation of entities before. Any feedback you can provide on this would be great! Thanks, Chris Hi everyone, I've been considering a Google Summer of Code project idea for DBPedia which would use Salient Semantic Analysis (an extension of Explicit Semantic Analysis) to disambiguate named entities in DBPedia. This could be an autonomous component, or a complement to the other methods being used to label named entities. Before fleshing out this idea further, I wanted to ask the community if anyone has tried using Latent Semantic Analysis, ESA or SSA for disambiguation of entities before. Any feedback you can provide on this would be great! Thanks, Chris" "Just when you thought it was safe[2000+ "countries"]" "uI took another look at the dbpedia ontology types and found something else disturbing: dbpedia has an order of magnitude more \"Countries\" than most authorities believe exist, for instance About 350 of these make it through my unicorn filter, which is still too many. 
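The order-of-magnitude claim above is easy to reproduce with a quick count. A minimal sketch, assuming dbpedia-owl:Country is the class being counted:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT (COUNT(DISTINCT ?country) AS ?n)
WHERE {
  ?country a dbpedia-owl:Country .
}

Comparing that number with the roughly 200 entries most authorities recognise gives a feel for how many historical states, dependencies and mis-typed resources end up under the class.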
Some real countries fell out (notably \"Russia\") but that's because of methodological problems on my end. It's pretty clear at this point that I'm going to have to work backwards, establishing spatial control from another source and then mapping known entities to dbpedia terms. Ugh, looks like I'm building my own taxonomy after all." "How Do We Deal with the Subjective Matter of Data Quality?" "uAll, Increasingly, the issue of data quality pops up as an impediment to Linked Data value proposition comprehension and eventual exploitation. The same issue even appears to emerge in conversations that relate to \"sense making\" endeavors that benefit from things such as OWL reasoning, e.g., when resolving multiple Identifiers with a common Referent via owl:sameAs, or exploitation of fuzzy rules based on InverseFunctionalProperty relations. Personally, I subscribe to the doctrine that \"data quality\" is like \"beauty\": it lies strictly in the eyes of the beholder, i.e., a function of said beholder's \"context lenses\". I am posting primarily to open up a discussion thread for this important topic." "Announcement: Navigational Knowledge Engineering (NKE) and HANNE" "uDear Colleagues, over the last year, we have worked on a methodology called Navigational Knowledge Engineering - in short NKE. We have amassed a good deal of documents, a web demo (HANNE), source code and images, which we publish and link to on this page: Summary: NKE is a light-weight methodology for low-cost knowledge engineering by a massive user base. Although structured data is becoming widely available, no other methodology – to the best of our knowledge – is currently able to scale up and provide light-weight knowledge engineering for a massive user base. Using NKE, data providers can publish flat data on the Web without extensively engineering structure upfront, but rather observe how structure is created on the fly by interested users, who navigate the knowledge base and at the same time also benefit from using it. The vision of NKE is to produce ontologies as a result of users navigating through a system. This way, NKE reduces the costs for creating expressive knowledge by disguising it as navigation. We would also like to steer your attention to the Web Demo [2], the Tutorial slides for the Web demo [3] and two mockups [4], which visualize how the methodology could be integrated into Wikipedia and Amazon.com. As we believe that the methodology is quite novel (please tell us in case you know something similar), we are still discussing all possible applications and implications. In particular, we are searching for suggestions on where to integrate and test our methodology next. Please feel free to contact us. Regards, Sebastian Hellmann, Jens Lehmann, Jörg Unbehauen, Claus Stadler and Markus Strohmaier (TU Graz) Links: [1] Main Page: [2] Web demo: [3] Tutorial slides: [4] Mockups: NKE#Mockups" "bogus data.nytimes.com owl:sameAs links" "uHi, there are owl:sameAs triples like these in the Needless to say that this is problematic if you do owl:sameAs reasoning. Is this the right place to report things like that? What can I do to help? Cheers, Jörn uOn 9/11/14 5:58 PM, Jörn Hees wrote: Good place to report these matters. Bottom line, the New York Times Linked Data is problematic. They should be using foaf:focus where they currently use owl:sameAs. I kind of fixed this in the last DBpedia instance, via SPARQL 1.1 forward-chaining. I guess I need to make time to repeat the fix.
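The forward-chaining fix itself is not shown on the list, but the general shape of such a rewrite can be sketched as a SPARQL 1.1 Update. This is only an illustration of the idea, not the actual fix script referenced later, and it assumes the goal is to turn every data.nytimes.com owl:sameAs link that points into DBpedia into a foaf:focus link:

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

DELETE { ?nyt owl:sameAs ?dbpedia }
INSERT { ?nyt foaf:focus ?dbpedia }
WHERE {
  ?nyt owl:sameAs ?dbpedia .
  FILTER ( STRSTARTS(str(?nyt), "http://data.nytimes.com/") )
  FILTER ( STRSTARTS(str(?dbpedia), "http://dbpedia.org/resource/") )
}

Rewriting the predicate does not decide which of two conflicting targets is the intended one, so the genuinely ambiguous subjects identified below would still need a manual pass.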
DBpedia Team: we need to perform this step next time around, if the New York Times refuse to make this important correction. Alternatively, you can make fix dump too. Either way, this is a problem that we should fix. uOn 12 Sep 2014, at 01:26, Kingsley Idehen < > wrote: I think it's a better idea to fix this in the dumps than only on one endpoint. I assume the wrong info is coming from the nytimes_links.nt.gz dump file (9678 lines). These are the double occurring data.nytimes.com URIs which link various wrong things with owl:sameAs: (I know it's a bit dirty, but the data.nytimes.com URIs are shorter than that and the 2nd column is long enough that the 47 char width never 3rd column): $ zcat nytimes_links.nt.gz | sort | uniq -D -w 47 | less . . . . . . . . . . . 1102 lines File here (18 KB): Did some quick stats: each of those URIs links exactly 2 things, so we have 551 of them which are problematic: $ zcat nytimes_links.nt.gz | sort | uniq | cut -d' ' -f1 | sort | uniq -d | less 551 lines This only leaves lines without the duplicate prefix $ zcat nytimes_links.nt.gz | sort | uniq -u -w 47 | less . . . . 8576 lines File here (194 KB): I'm not sure about the rest of that file though, given that nearly 1/10th of it were obviously wrong Cheers, Jörn uOn 9/12/14 4:58 PM, Jörn Hees wrote: Of course. My point is that when its fixed in the Virtuoso DBMS behind the endpoint, we then make a dump which becomes the replacement dataset for future efforts. Links: [1] nyt_dbpedia_mappings_fix.rq" "Sparql query vs. web interface" "uHi, I'd like to find all persons born in a given city. At query with Sparql, I only get 11. The query I use is this: select ?person where {?person } The endpoint I use is results that are shown on the web page? Regards Daniel uOn 1/28/13 3:15 PM, Daniel Naber wrote: Try: Query Definition: sparql?default-graph-uri=&qtxt;=select+distinct+%3Fperson+where+%7B%3Fperson+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FbirthPlace%3E+%0D%0A%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FG%25C3%25BCtersloh%3E%7D&format;=text%2Fhtml&timeout;=30000&debug;=on uHi Daniel, On 01/28/2013 09:15 PM, Daniel Naber wrote: The information of that resource is acquired from the German DBpedia [1]. So, simply to get the required information you should ask the following SPARQL query against the SPARQL endpoint of the German DBpedia[2]: SELECT ?person WHERE {?person dbpedia-owl:birthPlace } [1] [2] sparql uHi Kingsley, just wanted to add that sometimes. The same query: from time to time returns only 5 results (which are the ones shown here When the query returns 11 elements they are: http://dbpedia.org/resource/Simon_Gosejohann http://dbpedia.org/resource/Friedrich_Daniel_von_Recklinghausen http://dbpedia.org/resource/Theodor_Rumpel_(surgeon) http://dbpedia.org/resource/Thorsten_Stuckmann http://dbpedia.org/resource/Annabel_J%C3%A4ger By checking the first one, you can see that it has http://dbpedia.org/page/G%C3%BCtersloh as dbpedia-owl:birthPlace. But this is not shown in the http://dbpedia.org/page/G%C3%BCtersloh page as a reverse property. Do you know what is wrong? Regards 2013/1/28 Kingsley Idehen < > u0€ *†H†÷  €0€1 0 + uThanks Kingsley. Is the same reason why all the resources are not listed in the dbpedia-owl:birthPlace property? Thanks 2013/1/28 Kingsley Idehen < > u0€ *†H†÷  €0€1 0 + uHi Kingsley, Because we got a couple of similar mails some time ago, do you think that we should mention this in the DBpedia wiki as a disclaimer somewhere? 
(like here for instance Best, Dimitris On Mon, Jan 28, 2013 at 11:36 PM, Kingsley Idehen < >wrote: uHi, This seems to be the same question as in [1], though there was no answer to that. The fact that virtuoso doesn't always provide all the results to a query is disturbing. Is there a way to know when this happens? If it's a processing time issue, why doesn't it just fail with a timeout exception or a warning as it does sometimes? Regards, Julien [1] message.php?msg_id=30351258 uHi Kingsley, I know that you wrote about this many times in the past but people usually do not bother to search the archives, plus the sourceforge search capabilities are not very user friendly :) We should start documenting common queries in the DBpedia wiki and this is indeed a very common one. (Although I must admit that I didn't know about the latest selectivity behaviour you described) If this is documented somewhere in the Openlink website we could also put just a link and the next time someone asks we can just provide a link to the answer. It will save us time in the long run ;) Best, Dimitris On Tue, Jan 29, 2013 at 2:29 PM, Kingsley Idehen < >wrote: uHi, Sorry if I sounded complaining about the quality of service of dbpedia. I understand there is a big load on the server and that some limits must be enforced so that it is not overloaded with big queries. My concern is how to know when a query reaches one limit so that I can split the query to get the results by chunks. As far as I saw in the mailing list messages, the causes of partial results could be about default (implicit) limits on the number of results. So I thought until now that if I put an explicit limit in my query I would either get the full result or an exception. But this doesn't seem to be the case in the example given by Andrea: there should be 11 results, sometimes only 5 are given, and the response is given straight away. I am also running an image of DBpedia on virtuoso and I want to understand this to know at least when it can happen. You said there is a way to know when a result is incomplete with the SPARQL Protocol, I have to look at this. Thanks, Julien" "WORLDCOMP Strikes Again for the Last Time" "uI graduated from University of Florida (UFL) and am currently running a computer firm in Florida. I have attended WORLDCOMP Me and my UFL and UGA friends started a study on WORLDCOMP. We submitted a paper to WORLDCOMP 2011 and again (the same paper with a modified title) to WORLDCOMP 2012. This paper had numerous fundamental mistakes. Sample statements from that paper include: (1). Binary logic is fuzzy logic and vice versa (2). Pascal developed fuzzy logic (3). Object oriented languages do not exhibit any polymorphism or inheritance (4). TCP and IP are synonyms and are part of OSI model (5). Distributed systems deal with only one computer (6). Laptop is an example for a super computer (7). Operating system is an example for computer hardware Also, our paper did not express any conceptual meaning. However, it was accepted both the times without any modifications (and without any reviews) and we were invited to submit the final paper and a payment of $500+ fee to present the paper. We decided to use the fee for better purposes than making Prof. Hamid Arabnia (Chairman of WORLDCOMP) rich. After that, we received few reminders from WORLDCOMP to pay the fee but we never responded.
We MUST say that you should look at the website The status of your WORLDCOMP papers can be changed from “scientific” to “other” (i.e., junk or non-technical) at anytime. See the comments Our study revealed that WORLDCOMP is a money making business, using UGA mask, for Prof. Hamid Arabnia. He is throwing out a small chunk of that money (around 20 dollars per paper published in WORLDCOMP’s proceedings) to his puppet who publicizes WORLDCOMP and also defends it at various forums, using fake/anonymous names. The puppet uses fake names and defames other conferences/people to divert traffic to WORLDCOMP. That is, the puppet does all his best to get a maximum number of papers published at WORLDCOMP to get more money into his (and Prof. Hamid Arabnia’s) pockets. Monte Carlo Resort (the venue of WORLDCOMP until 2012) has refused to provide the venue for WORLDCOMP’13 because of the fears of their image being tarnished due to WORLDCOMP’s fraudulent activities. WORLDCOMP will not be held after 2013. The paper submission deadline for WORLDCOMP’13 is March 18, 2013 (it will be extended many times, as usual) but still there are no committee members, no reviewers, and there is no conference Chairman. The only contact details available on WORLDCOMP’s website is just an email address! What bothers us the most is that Prof. Hamid Arabnia never posted an apology for the damage he has done to the research community. He is still trying to defend WORLDCOMP. Let us make a direct request to him: publish all reviews for all the papers (after blocking identifiable details) since 2000 conference. Reveal the names and affiliations of all the reviewers (for each year) and how many papers each reviewer had reviewed on average. We also request him to look at the Open Challenge at We think that it is our professional obligation to spread this message to alert the computer science community. Sorry for posting to multiple lists. Spreading the word is the only way to stop this bogus conference. Please forward this message to other mailing lists and people. We are shocked with Prof. Hamid Arabnia and his puppet’s activities Sincerely, Chris I graduated from University of Florida (UFL) and am currently running a computer firm in Florida. I have attended WORLDCOMP" "noun-attribute lists" "uI am looking for noun-attribute lists - something like those available from ConceptNet. Is there anything like this available from DBpedia (or any other common sense database)? Any help would be much appreciated. David Levy. www.worldsbestchatbot.com P.S. Here is an example of a conceptnet noun-attribute pair. 
[chicken] {AtLocation} (2) pizza (2) a freezer (1) the oven (1) eggs (1) dinner (1) chicken and noodles (1) a plate (1) a movie (1) a fast-food restaurant (1) a farm (1) a fair (0) inside an egg {CapableOf} (8) cross the road (5) produce eggs (4) lay an egg (2) hatch from eggs (2) fly (1) taste good (1) roost in trees at night (1) require water to drink (1) live as long as 14 years (1) gives eggs (1) fly well (1) fly only a few feet (1) fly a little (1) fight for her chicks (1) die to feed people (1) commonly found in chinese food (1) chew bone (1) attempt to fly (1) 'scape the coop (0) stirfried in butter (0) food (0) consumed (0) be spiced with garlic when prepared for cooking (0) be an ingridient in a soup (0) also be stored in small spaces (0) a type of food {Causes} (0) {CausesDesire} (0) {DefinedAs} (0) {HasProperty} (0) {IsA} (10) a bird (5) meat (3) white meat (3) a popular form of food (2) farm animals (2) a type of food (2) a biped (1) white (1) the smartest creatures (1) stirfried in butter (1) served with potatoes (1) seafood (1) reheated in the microwave (1) prey (1) pets (1) part of a meal (1) have friends over (1) good to eat (1) fowl (1) food animals (1) cross the road (1) animals (1) an American president (1) a versatile meat (1) a type of meat (1) a type of bird (1) a traditional ingredient in lasagna (1) a staple food of the american diet (1) a main source of food (1) a good source of protein (1) a form of food (1) a feather (1) a common type of meat (1) a cake (1) A human hand (0) meat and salad is a mixture of vegetables and they are both types of food (0) food (0) domesticated birds (0) bird that is edible by humans (0) animals whose flesh necessarily contains cholesterol (0) animals used for food (0) an avian lifeform that have been domesticated and are reproduced in large numbers so they can be slaughtered and eaten by humans (0) an animal that people eat (0) a poutlry bird (0) a popular food for Americans (0) a meat that can be cooked to prepare cuisine (0) a food, foods can be delicious (0) a food that might be served at a party (0) a common type of meat used in cooking (0) a bird that people eat the flesh of\" (0) a bird that humans raise for food and eggs (0) A hen {ReceivesAction} (0) {UsedFor} (2) food (0) produce eggs" "DBPedia ontology - how to use it?" "uDBPedia, I’m trying to reuse the “international designator” property shown at Attempt 1) \"1987-022A” doesn’t appear on the corresponding DBPedia page Attempt 2) Attempt 3) and gets me to which shouldn’t be an integer (it’s a string), and I still can’t determine what URI to use for the property. How do I get the URI for the property? Also, is there a more introductory version of I have an account and want to help, but I’ve never been able to figure out how. Thanks, Tim Lebo uHi Tim, On Sun, May 4, 2014 at 10:27 PM, Timothy Lebo < > wrote: It does but it is generated from the raw infobox extractor that greedily guesses the datatype dbpprop:cosparId 1987 (xsd:integer) You can create a new class on your own or use a similar class (not a domain expert but the most related are under the MeanOfTransportation or Device) You can (should) change the range to xsd:string the above uri should be Actually all classes / properties are under the dbpedia-owl (http: dbpedia.org/ontology/) namespace. 
classes start with uppercase and properties all lowercase We also have the following documents, they are kind of old but the basic mapping function didn't change For easier mappings you can use the following chrome extension developed by Andrea Di Menna https://github.com/dbpedia/mappings_chrome_extension and there is also this mailing list ;) We also welcome documentation contributions on the wiki For this specific case you should create a new mapping for the \"infobox spaceslight\" http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_spaceflight Cheers, Dimtiris uHello Timothy, thanks for sharing your concerns and doubts :) On top of what Dimitris already explained I would like to add the following: 1) COSPAR property URI is actually previously shared URI) 2) cosparId property is currently used in other mappings ( ). One of those, namely is using it in an \"incorrect\" way, i.e. using the COSPAR id for the number of catalogued orbital launches in a specific year. As Dimitris said, the cosparId property range must be changed from xsd:integer to xsd:string. Would you like to make those changes? Feel free to ask any other question here on the mailinglist :) Cheers Andrea 2014-05-06 10:59 GMT+02:00 Dimitris Kontokostas < >: uHi, Dimitris. Thanks for your patience and pointers. Responses within. On May 6, 2014, at 4:59 AM, Dimitris Kontokostas < > wrote: Ah, I grepped for too much of the value. Good to see *some* connection from source to sink. Now, to help clean that up I think “Satellite” would be worthy of its own class. I’d tuck it under the appropriate DBPedia superclass, but can I also tuck it under the external class pext:ArtificialSatellite? I’m logged into but don’t appear to have edit permissions. DO I NEED MORE PERMISSIONS? Got it. I installed the extension, but http://mappings.dbpedia.org/server/statistics/en/?show=100000 doesn’t show the \"Infobox spaceflight\", which is the one that is used in http://en.wikipedia.org/wiki/GOES_7. Any idea why it’s missing from the list? (I can’t edit, DO I NEED MORE PERMISSIONS?) What’s the difference between the info box mapping page ^^ and http://mappings.dbpedia.org/index.php/OntologyProperty:CosparId ? And, why doesn’t http://dbpedia.org/ontology/cosparld resolve or point to either of these utility pages? Would be helpful to get to the control pages from the data itself (so we can reduce traffic on this list :-) Regards, Tim Hi, Dimitris. Thanks for your patience and pointers. Responses within. On May 6, 2014, at 4:59 AM, Dimitris Kontokostas < > wrote: Hi Tim, On Sun, May 4, 2014 at 10:27 PM, Timothy Lebo < > wrote: DBPedia, I’m trying to reuse the “international designator” property shown at http://en.wikipedia.org/wiki/GOES_7 (it has value \"1987-022A”). Attempt 1) \"1987-022A” doesn’t appear on the corresponding DBPedia page http://dbpedia.org/page/GOES_7 , so I can’t just copy/paste the property URI. It does but it is generated from the raw infobox extractor that greedily guesses the datatype dbpprop:cosparId 1987 (xsd:integer) Ah, I grepped for too much of the value. Good to see *some* connection from source to sink. Now, to help clean that up Attempt 2) http://mappings.dbpedia.org/server/ontology/classes/ doesn’t show “Satellite”, so no dice there. You can create a new class on your own or use a similar class (not a domain expert but the most related are under the MeanOfTransportation or Device) I think “Satellite” would be worthy of its own class. 
I’d tuck it under the appropriate DBPedia superclass, but can I also tuck it under the external class pext:ArtificialSatellite? http://prefix.cc/pext Attempt 3) http://mappings.dbpedia.org/index.php?title=Special:AllPages&namespace=202&from=Casualties&to=DistanceToDouglas shows “CosparId” and gets me to http://mappings.dbpedia.org/index.php/OntologyProperty:CosparId which shouldn’t be an integer (it’s a string), and I still can’t determine what URI to use for the property. You can (should) change the range to xsd:string I’m logged into http://mappings.dbpedia.org/index.php/OntologyProperty:CosparId but don’t appear to have edit permissions. DO I NEED MORE PERMISSIONS? How do I get the URI for the property? the above uri should be http://dbpedia.org/ontology/cosparld Actually all classes / properties are under the dbpedia-owl (http: dbpedia.org/ontology/ ) namespace. classes start with uppercase and properties all lowercase Got it. Also, is there a more introductory version of http://mappings.dbpedia.org/index.php/Mapping_Guide ? I have an account and want to help, but I’ve never been able to figure out how. We also have the following documents, they are kind of old but the basic mapping function didn't change https://github.com/dbpedia/extraction-framework/tree/master/core/doc/mapping_language For easier mappings you can use the following chrome extension developed by Andrea Di Menna https://github.com/dbpedia/mappings_chrome_extension I installed the extension, but http://mappings.dbpedia.org/server/statistics/en/?show=100000 doesn’t show the \"Infobox spaceflight\", which is the one that is used in http://en.wikipedia.org/wiki/GOES_7 . Any idea why it’s missing from the list? and there is also this mailing list ;) We also welcome documentation contributions on the wiki For this specific case you should create a new mapping for the \"infobox spaceslight\" http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_spaceflight (I can’t edit, DO I NEED MORE PERMISSIONS?) What’s the difference between the info box mapping page ^^ and http://mappings.dbpedia.org/index.php/OntologyProperty:CosparId ? And, why doesn’t http://dbpedia.org/ontology/cosparld resolve or point to either of these utility pages? Would be helpful to get to the control pages from the data itself (so we can reduce traffic on this list :-) Regards, Tim uHi, Andrea, Thanks for your patience. I think once I figure it all out I’ll be spending more time mapping than I should :-) Thanks for pointing out the “what links here” page. I was able to get to it from (FWIW, I’d *really* like to see I’d like to make the changes myself, so that I can learn. I think I don’t have enough permissions. I’m Regards, Tim On May 6, 2014, at 8:22 AM, Andrea Di Menna < > wrote: uHi Tim, yes you need to have editor rights on the mappings wiki. Someone will enable you as soon as possible (@Dimitris?). Some answers inline. Cheers Andrea 2014-05-06 17:02 GMT+02:00 Timothy Lebo < >: Statistics are generated periodically from Wikipedia dumps. I do not remember exactly when the last stats were created, but the \"Infobox spaceflight\" should be therecould you issue a bug please? Anyway, even if it does not appear in the stats page you can still add a mapping for that Infobox (once you will have editor rights). The mapping page allows you to map templates and properties into ontology classes and ontology properties. The ontology property (class) page allows you to define an ontology property (class). 
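Going back to Tim's original problem of finding the property URI from the data itself, one option is to list the raw infobox properties of the resource directly at the endpoint and filter on a fragment of the expected name. A minimal sketch for GOES_7:

SELECT ?property ?value
WHERE {
  <http://dbpedia.org/resource/GOES_7> ?property ?value .
  FILTER ( regex(str(?property), "cospar", "i") )
}

This is how the dbpprop:cosparId triple with the mis-parsed value 1987 mentioned above can be spotted; once the infobox mapping exists and the range is corrected to xsd:string, a cleaner dbpedia-owl:cosparId value should appear alongside it in a later release.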
uAndrea, On May 6, 2014, at 11:52 AM, Andrea Di Menna < > wrote: Eagerly awaiting privs :-) Now sure where to submit an issue or what I’d say :-) I’m comfortable just piecing together the template myself into but is there a way to use the Chrome extension without walking “edit” links from E.g., could I invoke the extension when I’m on Thanks. Hopefully it’ll be obvious once I work the example. Regards, Tim Andrea, On May 6, 2014, at 11:52 AM, Andrea Di Menna < > wrote: Hi Tim, yes you need to have editor rights on the mappings wiki. Someone will enable you as soon as possible (@Dimitris?). Eagerly awaiting privs :-) For easier mappings you can use the following chrome extension developed by Andrea Di Menna https://github.com/dbpedia/mappings_chrome_extension I installed the extension, but http://mappings.dbpedia.org/server/statistics/en/?show=100000 doesn’t show the \"Infobox spaceflight\", which is the one that is used in http://en.wikipedia.org/wiki/GOES_7 . Any idea why it’s missing from the list? Statistics are generated periodically from Wikipedia dumps. I do not remember exactly when the last stats were created, but the \"Infobox spaceflight\" should be therecould you issue a bug please? Now sure where to submit an issue or what I’d say :-) https://github.com/dbpedia/mappings_chrome_extension/issues/2 Anyway, even if it does not appear in the stats page you can still add a mapping for that Infobox (once you will have editor rights). I’m comfortable just piecing together the template myself into http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_spaceflight , but is there a way to use the Chrome extension without walking “edit” links from http://mappings.dbpedia.org/server/statistics/en/?show=100000 ? E.g., could I invoke the extension when I’m on http://en.wikipedia.org/wiki/GOES_7 (ideally) or http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_spaceflight ? http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_spaceflight (I can’t edit, DO I NEED MORE PERMISSIONS?) What’s the difference between the info box mapping page ^^ and http://mappings.dbpedia.org/index.php/OntologyProperty:CosparId ? The mapping page allows you to map templates and properties into ontology classes and ontology properties. The ontology property (class) page allows you to define an ontology property (class). Thanks. Hopefully it’ll be obvious once I work the example. Regards, Tim uTim, 2014-05-06 18:02 GMT+02:00 Timothy Lebo < >: The DBpedia extraction framework is the project you are looking for ;) Anyway, even if it does not appear in the stats page you can still add a Great :) Nope. The extension \"activates\" as soon as you start editing a new mapping page and pre-fills the mapping text with info read from statistics. If there are no stats for the template you are mapping then the extension cannot help you :-) But I am sure you will be able to edit the mapping from scratch anyway, it just takes a bit of patience to understand the building blocks. Shout if in need." "for your information" "uHoi, I blog every now and again about DBpediaI did it againwhen you have good ideas how we can use the DBpedia to improve Wikipedia, to enrich Wikipedia let me knowwhen I understand what you are saying, I am happy to blog about it. Thanks, GerardM Hoi, I blog every now and again about DBpediaI did it againwhen you have good ideas how we can use the DBpedia to improve Wikipedia, to enrich Wikipedia let me knowwhen I understand what you are saying, I am happy to blog about it. 
Thanks,        GerardM DBpedia" "owl:sameAs values inconsistent with internal resource URIs" "uPREFIX dbpedia-owl: PREFIX dbpprop: PREFIX drugbank: PREFIX rdf: PREFIX rdfs: PREFIX owl: SELECT * WHERE{ { SELECT * WHERE { SERVICE {?drug rdfs:label \"Lepirudin\"; drugbank:affectedOrganism ?affectedOrganism}} } . { SELECT ?drug ?routes WHERE { SERVICE {?a rdfs:label \"Lepirudin\"@en; owl:sameAs ?drug; dbpprop:routesOfAdministration ?routes}} }} This query returns no results. However, each subquery within it does return results. The results of each subquery are not joinable by the variable ?drug because the objects in the owl:sameAs relationships across dbpedia are not the actual resource URIs, that are used in other datasets such as drugbank. These are the owl:sameAs values for * * * * freebase:Lepirudin The actual resource URI used by drugbank is . Instead of using values that redirect to the resource, dbpedia should use the values used by other datasets to represent the resource. Following such practices would make federated queries possible, and generally improve the quality of the Semantic Web. Grant Smith Grant Smith 816-588-2004 P {margin-top:0;margin-bottom:0;} PREFIX dbpedia-owl: < are used in other datasets such as drugbank. These are the owl:sameAs values for < Following such practices would make federated queries possible, and generally improve the quality of the Semantic Web. Grant Smith Grant Smith 816-588-2004 uHi Grant, When I click the link I get redirected to (3 instead of 4), so it looks like the \"resource\" URIs are not stable. There is no owl:sameAs either so the resources could be different. The redirect URI should be stable (\"Cool URIs don't change.\"). In the page that is returned, I see owl:sameAs to DBpedia drugs, but those URIs include \"www.\" which I believe is incorrect. I agree that shared URIs increase the value of different descriptions, but the Drugbank could use some changes too :) Besides, DBpedia extracts information from Wikipedia (only, AFAIK), so changes should be made there. I hope I made my point that the Drugbank uses inconsistent URIs, so if you do change Wikipedia, there may be people not agreeing to your edits for this reason. Regards, Ben On 5 March 2013 23:57, Smith, Grant M. (UMKC-Student) < > wrote: uHi Grant, in addition to what Ben mentioned, you can still get the required results using the following query: PREFIX dbpedia-owl: PREFIX dbpprop: PREFIX drugbank: PREFIX rdf: PREFIX rdfs: PREFIX owl: SELECT * WHERE{ { SELECT * WHERE { SERVICE {?modifiedDrug rdfs:label \"Lepirudin\"; drugbank:affectedOrganism ?affectedOrganism} } }. { SELECT IRI(REPLACE(STR(?drug), \" \" SERVICE {?a rdfs:label \"Lepirudin\"@en; owl:sameAs ?drug; dbpprop:routesOfAdministration ?routes.} } } } Hope it helps. On 03/06/2013 10:16 AM, Ben Companjen wrote: u0€ *†H†÷  €0€1 0 + uThank you for the responses, these should be useful. Grant Smith 816-588-2004 From: Kingsley Idehen [ ] Sent: Wednesday, March 06, 2013 6:15 AM To: Subject: Re: [Dbpedia-discussion] owl:sameAs values inconsistent with internal resource URIs On 3/6/13 5:08 AM, Mohamed Morsey wrote: Hi Grant, in addition to what Ben mentioned, you can still get the required results using the following query: PREFIX dbpedia-owl: PREFIX dbpprop: PREFIX drugbank: PREFIX rdf: PREFIX rdfs: PREFIX owl: SELECT * WHERE{ { SELECT * WHERE { SERVICE {?modifiedDrug rdfs:label \"Lepirudin\"; drugbank:affectedOrganism ?affectedOrganism} } }. 
{ SELECT IRI(REPLACE(STR(?drug), \" SERVICE {?a rdfs:label \"Lepirudin\"@en; owl:sameAs ?drug; dbpprop:routesOfAdministration ?routes.} } } } Great SPARQL-FED utility showcase :-) Kingsley Hope it helps. On 03/06/2013 10:16 AM, Ben Companjen wrote: Hi Grant, When I click the link I get redirected to (3 instead of 4), so it looks like the \"resource\" URIs are not stable. There is no owl:sameAs either so the resources could be different. The redirect URI should be stable (\"Cool URIs don't change.\"). In the page that is returned, I see owl:sameAs to DBpedia drugs, but those URIs include \"www.\" which I believe is incorrect. I agree that shared URIs increase the value of different descriptions, but the Drugbank could use some changes too :) Besides, DBpedia extracts information from Wikipedia (only, AFAIK), so changes should be made there. I hope I made my point that the Drugbank uses inconsistent URIs, so if you do change Wikipedia, there may be people not agreeing to your edits for this reason. Regards, Ben On 5 March 2013 23:57, Smith, Grant M. (UMKC-Student) < > wrote:" "lookup service refcount oddity" "uHello DBpedia people, For two slightly different querystrings the lookup service returns the same resource, but with different refcounts and different redirects. Bug or feature? querystring=antique&maxhits;=6&queryclass;= querystring=antiques&maxhits;=6&queryclass;= They both have ' result, but once with a refcount of 1596 (antiques) and once with 500 (antique). Thanks, John This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uHi John, Not a bug, so probably a feature. If the lookup used a Wikipedia redirect to find the query result, then it's listed in the \"Redirects\" field. The \"Redirects\" is not a list of all redirects. In the past (before adding redirects to the system), the refcount used to simply be the number of inlinks for a resource. But that's not the case for \"redirected\" results: There, the refcount is the number of inlinks for the *redirect* plus a function of the number of inlinks for the resource (something like inlinks(redirect) + log2(inlinks(resource))*100). This way, both the redirect itself and the according resource are weigthed in combination. That's because Wikipedia redirects can't be always threaded as synonyms. There was an example of \"XYZ_(Bill_Clinton's_dog)\" redirecting to \"Bill_Clinton\" in Wikipedia. There, you want to assign a weight to the redirect itself, or else Bill Clinton would show up as result for the keyword \"dog\" Hope that explanation makes sense. Best, Georgi" "Beginners Question" "uIf I have a URI for an Entity page (e.g. How can I tell what Type of Entity it is without actually parsing the page itself? I mean, I need to know that before I can extract information right? uYou could download the dump Ontology Infobox Types from So you know the type before requesting more data via content negotiation. You actually meant: or right? I was wondering what you mean with \"parsing the page\" . Sebastian On 01/10/2012 01:06 AM, Col Wilson wrote:" "Game based data extraction" "uHello sir How can I extract game based data from dbpedia using Sparql query? 
Thanks Regards Hello sir How can I extract game based data from dbpedia using Sparql query? Thanks Regards uCan you be more specific? what kind of game data are you looking for? On Sun, Feb 7, 2016 at 11:21 PM, kumar rohit < > wrote: uThese would be a good place to start On Sun, Feb 7, 2016 at 11:30 PM, kumar rohit < > wrote:" "DBpedia 3.8 dump files" "uHi, i'm currently deciding on what of the DBpedia 3.8 dumps to load into our local mirror… (and updating I'm a bit clueless about three files on the download that aren't explained, so maybe someone could shed some light on them … (I already checked - Probably just some leftover from testing? snip \"Name\"@en . \"Alt\"@en . \"n\"@en . \"v\"@en . \"name\"@en . /snip - - It seems they contain category pages which have an associated main article. As such very interesting, but it seems they use the deprecated skos:subject while all other files use dcterms:subject, and the \"redirect resolving\" wasn't done for the subjects: snip . . /snip And now with redirects resolved (watch the \"_(TV_series)\" disappear in the object position but not the subject): snip . . /snip Are there other places where this might cause \"dangling subjects\"? Despite these small issues it's actually quite an interesting dataset, any other reasons why it wasn't loaded? Why not map it to a \"dpo:categoryMain\" property. (dcterms:subject is already provided by article_categories and foaf:isPrimaryTopciOf wouldn't be correct as it would imply one of them being a document and it would cause problems as in older versions where people used it to get the corresponding wikipedia page for a topic and unintentionally get the category back.) Cheers, Jörn uJörn, Despite these small issues it's actually quite an interesting dataset, any This dataset is part of the NLP series of datasets we've been generating within the DBpedia Spotlight team. See: Since they are a bit different from the \"standard\" datasets, they were not loaded. But if the community requests, I'm sure it could be done. Why not map it to a \"dpo:categoryMain\" property. Mostly because it's a wikipedia-specific term. We were looking for something more generic, but didn't find anything that pleased everybody, so we picked one and waited to see if it would generate interest in the community. :) I'm ok with changing it. This is a relationship between an \"entity\" (a Wikipedia-independent concept) and a \"category\" (a Wikipedia-specific concept), so we might not be able to abstract too far away from Wikipedia anyhow at this point. Cheers, Pablo On Thu, Oct 25, 2012 at 8:33 PM, Jörn Hees < > wrote: uOn Thu, Oct 25, 2012 at 8:33 PM, Jörn Hees < > wrote: These datasets are needed for the mapping statistics [1]. The name \"test\" has historical reasons and doesn't really make sense anymore. The RDF triples aren't really subject-predicate-object, they just list the templates and properties that are used on Wikipedia pages. Cheers, JC [1]" "query always times out" "uHi all, Has anyone an idea how to optimize a query like this one: SELECT DISTINCT ?order ?label WHERE { ?species rdf:type dbpedia-owl:Species . ?species dbpedia-owl:order ?order . ?order rdfs:label ?label FILTER(LANG(?label) = \"en\") . } I'm trying to get the order attribute of all species but the query always times out at the dbpedia SPARQL endpoint. Removing the FILTER constraint or DISTINCT makes the query work but also retrieves a lot of (in my case) irrelevant results. Thanks for help. Regards, Sören uSören Brunk wrote: uThanks for the tip. 
Unfortunately I'm still getting a timeout with that query when using LIMIT and OFFSET. Regards, Sören uSören Brunk wrote: Soren, Using the LOD Cache instance at (which also has DBpedia 3.5 loaded): Graph IRI: SPARQL Query Results URL: The timeout setting for dbpedia.org is 300 seconds, and the infrastructure behind it is smaller than the LOD Cloud Cache instance. uThank you, using the LOD instance works most of the time, although sometimes I'm getting a proxy error (Reason: Error reading from remote server), not sure if this indicates a timeout on the db server or something else. But now another issue shows up: Is it possible that for some reason the dbpedia owl ontology isn't included in the LOD instance? ASK WHERE {?class rdf:type owl:Class} returns false here while it returns true at the dbpedia.org/sparql endpoint. Regards, Sören uI found another solution for my problem by using a subquery. The query below seems to produce the same results as the original one, but executes much faster (making it work at the endpoint). SELECT ?order ?label WHERE { ?order rdfs:label ?label FILTER(lang(?label) = \"en\") . {SELECT DISTINCT ?order WHERE { ?species rdf:type dbpedia-owl:Species . ?species dbpedia-owl:order ?order . }} } LIMIT 50 Regards, Sören" "Modelling Provenance of DBpedia Resources Using Wikipedia Contributions" "uHi, together with Alexandre Passant from DERI Galway we were working on provenance of DBpedia resources. We extract provenance information from Wikipedia and then we semantically represent it and expose it as Linked Data. An article about this work has just been published online on the \"Journal of Web Semantics: Science, Services and Agents on the World Wide Web\", here is the link to it [1], and here is the abstract: Modelling Provenance of DBpedia Resources Using Wikipedia Contributions \"DBpedia is one of the largest datasets in the linked Open Data cloud. Its centrality and its cross-domain nature makes it one of the most important and most referred to knowledge bases on the Web of Data, generally used as a reference for data interlinking. Yet, in spite of its authoritative aspect, there is no work so far tackling the provenance aspect of DBpedia statements. By being extracted from Wikipedia, an open and collaborative encyclopedia, delivering provenance information about it would help to ensure trustworthiness of its data, a major need for people using DBpedia data for building applications. To overcome this problem, we propose an approach for modelling and managing provenance on DBpedia using Wikipedia edits, and making this information available on the Web of Data. In this paper, we describe the framework that we implemented to do so, consisting in (1) a lightweight modelling solution to semantically represent provenance of both DBpedia resources and Wikipedia content, along with mappings to popular ontologies such as the W7 – what, when, where, how, who, which, and why – and OPM – open provenance model – models, (2) an information extraction process and a provenance-computation system combining Wikipedia articles’ history with DBpedia information, (3) a set of scripts to make provenance information about DBpedia statements directly available when browsing this source, as well as being publicly exposed in RDF for letting software agents consume it. \" It would be great to have comments and suggestions from you! Best, Fabrizio [1] j.websem.2011.03.002 uInteresting stuff, but to see anything other than the abstract would cost me $39.95. 
If you could send me a (p)(r)eprint I could definitely share my thoughts. In semantic work, you've got to evaluate the usefulness of something against particular use cases, so I'll bring up two examples of provenance in DBpedia-derived work that came up recently: (1) I got a complaint from a user who found a highly offensive slur in an abstract that came from DBpedia. In this case I think a student at a high school wrote something bad about the principal. Certainly we could trace this from the Wikipedia history. I like to be proactive when problems turn up, so it would be nice to be able to find other texts that were touched by the same person in case he's a serial vandal. (2) It's a bit out of DBpedia's scope, but I was contacted by a person who wanted to use an image in a book and needed a higher resolution image than was available from Wikimedia Commons. In this case the only provenance information that was there (in semistructured form) told us the image was public domain because it was scanned out of the old book. The provenance information that I wanted was the identity of the old book so these people could have found the book in a library and scanned it themselves. In this case the information just wasn't there and the only route forward would be a perscomm to the person who uploaded it. uHi, thanks for pointing that out, you can download the paper also from here: You're right, the first example you provided is exactly one of the reasons for our work presented in this paper. I think your second example is more related to the kind of metadata that Wikimedia provides (or should provide) on its MediaWiki wikis (e.g. Wikipedia). Thanks, Fabrizio On 19/05/11 18:42, Paul Houle wrote: uHi, On 24/05/11 01:32, Paul Houle wrote: Thanks! That's right, applying this work to Wikipedia entirely using the Wikipedia API would be a problem, as you have seen we did that only on a subset of the WIkipedia articles. In that case using the full dump of Wikipedia would speed up the process considerably. In our case the solution adopted was quicker. As regards the huge amount of triples that would have been generated, in the provenance research field this is still described as one of the \"major technology gaps in the state of the art\" (See the W3C Provenance Incubator Group Report [1]). 7 billion triples in the resulting dataset, after running the provenance extraction process on the full Wikipedia dump, can be a problem indeed. However, there are some working implementations of triplestores serving more than 10 billion triples [2]. In addion, we should consider advances in RDF storage with new triple stores that may handle such amount of data, and clustering architectures to do so. At the moment I think it would be nice to have provenance information in DBpedia describing only the latest version of the Wikipedia article that generated the corresponding latest triples on DBpedia. Without the record of all the previous edits/versions. Publishing provenance information in RDF is useful especially for building tools and applications capable of analysing and reasoning on this data. One example can be to provide trust and quality metrics about Wikipedia articles and their editors. Or in case of conflicting statements from different sources being able to choose the preferred source, etc. Not sure this answers your question Thanks for the feedback! Fabrizio [1] [2] LargeTripleStores uOn 5/25/11 3:43 PM, Fabrizio Orlandi wrote: LOD Cloud cache at: has 23 Billion Triples and counting. 
It exists to showcase what's real today re. state of the art in bigdata realm. Links: 1. linked_data_demo" "Missing links among categories?" "uDear all, I'm trying to derive a tree of categories from DBPedia but as I get so many top categories I have taken a deeper look into the data and compared it with the Wikipedia counterpart. It seems that there are many missing broader links among categories, at least from what there is in Wikipedia. For instance, compare: In DBPedia, there is just a broader link from no outgoing broader or narrower link. In Wikipedia, there are 10 subcategories (from 1910 births to 1919 births) and two supercategories (1910s and 20th-century births). Has someone experimented similar issues? Is there another available dump of the Wikipedia categories using SKOS? Any help appreciated, Best, Roberto García ~roberto uHi, 2011/1/14 Roberto García < >: The DBpedia extraction framework operates on the wiki source code of Wikipedia. For category pages, it extracts skos:broader relations using links to other category pages. This works well for example in this case: Unfortunately, it is difficult for the extraction framework to deal with all the different templates that Wikipedia offers its contributers. Looking at the wiki source code of almost the complete page is created by two templates ({{birthdecade|}} and {{Commons cat|}}). Therefore, the extraction does not find many links to other categories and cannot produce the desired data. I know this does not provide a solution to your problem, but perhaps a better understanding. Cheers, Max" "occasional "syntax error" on public sparql endpoint" "uHi all, I'm getting occasional errors in submitting (programmatically) a fixed query to dbpedia public sparql endpoint ( The issue is the same even if I submit the query using my browser. Here's the query: construct{?s ?p ?o} where { {?s ?p ?o . values ?o { }} union {?s ?p ?o . values ?s { } filter isUri(?o)} } limit 100 What am I missing? Bests regards uHi Riccardo, On 03/06/2013 08:13 PM, Riccardo Porrini wrote: Please try the following query: CONSTRUCT {?s ?p ?o} WHERE { {?s ?p ?o . FILTER ( ?o = ) } UNION {?s ?p ?o . FILTER ( ?s = )} } LIMIT 100 uWorks nice. Perfect. Thank you On Wed, Mar 6, 2013 at 8:27 PM, Mohamed Morsey < > wrote:" "what areas of knowledge has the best quality and coverage in dbpedia?" "uHi everyone! What topics/categories have the best quality of the data? What topics are covered better than other? Are there any analysis about that or maybe your personal feelings?" "Navbox templates" "uHi, Great to see some of the updates made to DBpedia 3.2. It still however strikes me that there is no data available for any of the Navbox templates found at the bottom of many Wikipedia articles (for example, the \"The Lord of the Rings\" and \"Berlin\" sample resources). To me at least this sort of information would be hugely useful to have in RDF form. Perhaps you have simply not gotten around to implementing this feature yet, or is there a specific reason why Navbox data is ignored? Alex uCould anyone clarify this for me, please? Alex uDear Alex, Alex wrote: To give you a very late reply: We simply did not get around to implementing this. I agree that the Navbox template contains a lot of information. I am unsure whether the standard infobox extraction would produce useful results there. 
It's also interesting to note that some of the navbox subjects seem to be useful class candidates (often with a complete enumeration of all their instances), which could be connected to the DBpedia schema. However, this isn't always the case. Kind regards, Jens PS: In case you are strongly interested in getting this information in DBpedia, I could give you some advice on how to write a DBpedia extractor (which is not too hard). uHi Jens, Thanks for the response. To be honest I'm in no real hurry, though it would still be nice to see Navbox templates incorporated into DBpedia sooner rather than later. Now, if you're willing to give me a few guidelines as to how the Navbox extractor should integrate with DBpedia, along with some general advice on how to write an extractor, I would be glad to have a go at implementing it. I could even experiment with connecting it to the DBpedia ontology, though as you mention there may only be limited success here. If I submitted a stable version of the extractor to you at some point, would you perhaps be able to release an update to DBpedia within a relatively short time? Regards, Alex uHello Jens, I was wondering if you could provide me with just a few tips on writing the Navbox template extractor that I described in my first post. There seems to be a brief guide to the extraction framework at helpful, especially regarding the integration with the existing parts of the framework/database. Thanks, Alex uJens (or anyone else who might be able to help), Sorry to bother you againI was still wondering if you could give me some recommendations for creating the Navbox template extractor, as explained in my previous posts. I am quite keen to write such an extractor, but suspect it will be a significant amount of work and would like to know how to approach the task. Alex From: Alex [mailto: ] Sent: 09 January 2009 14:44 To: ' ' Subject: Re: Navbox templates Hello Jens, I was wondering if you could provide me with just a few tips on writing the Navbox template extractor that I described in my first post. There seems to be a brief guide to the extraction framework at helpful, especially regarding the integration with the existing parts of the framework/database. Thanks, Alex uHello Alex, Alex wrote: Sorry for the very long delays. Here are some steps you need to take (it is not hard, but please aks me if I forget to mention intermediate steps): * checkout the latest DBpedia source code from Subversion [1] * create a file $YourExtractor in /extraction/extractors, which implements the Extractor interface - the MediaWiki markup of an article page is passed to your extractor, so the implementation usually involves pattern matching etc. - the result is a set of RDF triples * to test your extractor run /extraction/extract_test.php - before you do this, you'll have to specify article and extractor name in the file - the article will automatically be downloaded and your extractor will be executed within the framework [2] - watch whether the returned RDF triples are what you expect The PHP commands can/should be executed on the command line. If you want to commit your extractor to DBpedia, please make sure to test it on a sufficient number of articles first [3], then send me a message and I can grant you the rights to commit it to SVN. Thanks a lot for your interest in contributing to DBpedia! 
Kind regards, Jens [1] [2] [3] As an alternative option for more complete testing, you can also download the latest Wikipedia dumps via /importwiki/import.php and then run your extractor on all articles using /extraction/extract_dataset.php." "getting countries with their labels" "uHello DBpedia experts, I have a simple question. Which dataset do I have to download from the PREFIX rdfs: PREFIX rdf: SELECT ?countryRef ?country WHERE { ?countryRef rdf:type . ?countryRef rdfs:label ?country . FILTER(langMatches(lang(?country), \"en\")) } LIMIT 10 If I run this query using the Thanks, Julius Chrobak mingle.io - Query API for Open Data Hello DBpedia experts, I have a simple question. Which dataset do I have to download from the Data uHi Julius, On 06/07/2013 09:40 AM, Julius Chrobak wrote: In order for your aforementioned query to return the required results you should load at leastthe following datasets: 1. Ontology Infobox Types [1], 2. Titles [2]. [1] [2] labels_en.nt.bz2 uthank you very much Mohamed for your help! On Jun 7, 2013, at 10:40 AM, Mohamed Morsey wrote:" "Data returned on dbpedia.org/ontology/" "uHi group, We work on some software which heavily relies on the ontologies used by the data. This means we dereference the ontologies used on data sets and do some inference to figure out additional stuff about the data. For most ontologies this works pretty well. Last week we were test driving our software against some data at DBPedia, namely the page of Tim Berners-Lee at So far so good, in there we have several rdf:type definitions, including dbpedia-owl:Person, which points to On that point we noticed that it took way too long to get the page, cache it and do some stuff on it. So we started analyzing it and did it by hand: % curl -I -H \"Accept: application/rdf+xml\" HTTP/1.1 303 See Other Date: Mon, 21 May 2012 19:00:08 GMT Content-Type: application/rdf+xml Connection: keep-alive Server: Virtuoso/06.04.3132 (Linux) x86_64-generic-linux-glibc25-64 VDB Accept-Ranges: bytes Location: Content-Length: 0 Not a problem, the system can handle redirects. So we get the other file instead. And boy were we confused: It returns an 8MB file for the request (which took quite some time to get btw) After analyzing it in rapper I figured out that we got about 50'000 triples, probably less than 20 are really related to the ontology and the rest is stuff like: a . While I do see that this \"reverse property\" or however it is called might be interesting when I browse the data set in my web browser it is in my opinion plain wrong to return it on the URI which dereferences the ontology. Our software is also targeted at smart phones, you can imagine that it is not really fun to get 50'000 triples back on a crappy 3G link with volume limits and then parse and cache them on a device which is running on battery power. If I do that on several dbpedia data sets I'm probably out of power very soon and didn't even get half of the ontologies used in the data. What is your opinion on that? Is there a good reason for this or did you just think it might be useful? As you can see this pretty much kills the way we use ontologies and I think the \"classical\" way to dereference ontologies makes way more sense, so I would vote to change this behavior on dbpedia and return uniquely the definition itself. thanks cu Adrian u+100! Separation of T-Box and A-Box descriptions seems quite a reasonable requirement, in particular when there are so many instances! 
Or does it mean that the only way to describe the class \"Person\" is in extension : nobody can provide a definition of what a Person is, but everybody knows when she meets one :) Bernard 2012/5/30 Adrian Gschwend < > uSeparation of schema and instances is a good solution for this symptom. The root of the problem seems to be that when the amount of data grows, \"just returning everything you know about something\" is not going to cut it. A solution for that seems to be still missing. Perhaps we need a mechanism of views for linked data u0€ *†H†÷  €0€1 0 + uOn 30.05.12 16:24, Kingsley Idehen wrote: Hey Kingsley, wow definitely didn't expect that anytime soon :-D Thanks a lot! Will test-drive it tonight. great tnx! I hope this will make the services a bit more responsive. cu Adrian uOn 5/30/12 10:32 AM, Adrian Gschwend wrote: When we get feedback we can act quickly. We have a powerful and highly configurable Linked Data platform at our disposal :-) Yes, for clients that understand how to exploit HTTP . Kingsley uOn 30.05.12 16:37, Kingsley Idehen wrote: great will definitely report more when we run into issues :-) cu Adrian uHi Kingsley, some more feedback The following data redirections doesn't work on ontology resources and microdata/json doesn't work on normal resources either Cheers, Dimitris On Wed, May 30, 2012 at 5:37 PM, Kingsley Idehen < >wrote: u0€ *†H†÷  €0€1 0 + u0€ *†H†÷  €0€1 0 + uGreat! That was a quick fix too. The microdata/json seems to be a virtuoso bug though, the same error is returned from the SPARQL endpoint too Cheers Dimitris On Fri, Jun 1, 2012 at 8:31 PM, Kingsley Idehen < >wrote:" "Querying data sets on dbpedia" "uHi, I need to know how to just query a specific DBPedia data set of the many mentioned here getting any success using the from clause. Is there a way to do that? Hi, I need to know how to just query a specific DBPedia data set of the many  mentioned here that? uHello Sid, AFAIK we don't expose the data in named graphs, so you can't query the public sparql endpoint for triples from a particular dataset. At least not that I was aware of. Kingsley? Best, Georgi uGeorgi Kobilarov wrote: or Kingsley uKingsley Idehen wrote: (asking on sid's behalf since I was trying to help him/her out on IRC) I don't see anything listed there that leaps off the page as being restricted to the infobox triples, which I believe is the particular dataset that sid is interested in. The URIs on content-oriented than dataset oriented. Lee uLee Feigenbaum wrote: The first level is the Graph Group, then within the Graph Group you have IRIs. I am assuming he is seeking specific graph names? Kingsley uWell when i query a URI in SPARQL I generally get all the \"things\" that are related to the resource(unless I really narrow it down). It would be nice that if I am loking at infoboxes for resources like LINKED to that URIIf i want only infobox data for this reosurce only, then querying then mentioning the infobox data set clause would give me that(Which is not the same as specyfying a RDF graph since datasets arent really there I presume and have been converted to RDF) The only way to do this right now is to download some data in the local HDD and convert it to RDF and query the data :)Is there a simple fix to just narrow this and get what I want?(Perhaps an extra attribute related to each resource that tells me about the dataset also? 
and something like in the SPARQL END POINT ) Thanks and regards On Fri, May 15, 2009 at 10:57 AM, Kingsley Idehen < >wrote: uSid wrote: Are you asking for a URI like: clearer, this is a database, and it has a query language, and so you should be able to query and filter (including inference rules exploitation). Kingsley uKingsley Idehen wrote: Seems clear to me :) Sid is asking how he can query one of the specific datasets listed at the URI in his first message: without getting triples from some of the other datasets. (In sid's case, he's interested only in triples coming from the infobox part of the dataset.) Lee uLee Feigenbaum wrote: guidelines re. data partitioning with the Quad Store. Naturally, this guideline wasn't applied to the live DBpedia instance since its quite old. More recent loads (e.g. the lod.openlinksw.com instance) are based on this approach (but not functional it seems). We should have: Group: Graph IRI: I am not currently seeing that in the VoiD graph, so I'll look into what's amiss here re. external access to this data via SPARQL (worst case DESCRIBE should reveal) . You should be able to get at the Graph Group and the Graph IRIs (within groups) . Once this is resolved, what you see in the lod instance will make its way to DBpedia live after the next update, and DBpedia2 in the interim. Kingsley uLee got it rightKingsley: So the final thing is thisI state again to make it exactly clearAs an exampleThe data set infobox information extracted from various wikisSo If i query in SPARQL with : and specify the data set in the triple to be Graph IRI: then I should get the triples only from that data setThis should be REALLY useful users will not not have to do a LOT of extra processing! Regards Sid On Fri, May 15, 2009 at 2:32 PM, Kingsley Idehen < >wrote: uOr specify the GRAPH IRI in a way that is convienient to implement in the dbpedia release On Fri, May 15, 2009 at 3:20 PM, Sid < > wrote: uSid wrote: Yes, and that's what's supposed to be in place re. DBpedia2 and LOD prior to DBpedia Live. The problem is that we hadn't actually performed the DBpedia reloads on DBpedia2 and LOD that make this happen. The other interesting deliverable (now that I've been able to look into this properly) is that the source files we are preparing also include xxx.graph files that also expose Graph IRIs. Kingsley uLee Feigenbaum wrote: Sid, See this staging edition of DBpedia at: represents a Named Graph under the Graph Grp. , so is the stats associated with Named Graph: If you go via /fct, you can also use the \"statistics\" link to expose all the Named Graphs that reference an Entity, ditto its soure Graphs. Kingsley" "gone?" "uWhat happened to It now redirects to . uHi Vladimir, As just reported on another thread, the VMs hosting the lookup service as well as the mappings wiki have a hardware problem since Monday. The administrators of the university are already trying to restore them. Best, Dimitris On Wed, Dec 9, 2015 at 11:40 AM, Vladimir Alexiev < > wrote:" "Has DBpedia an internal limit for the SPARQL OFFSET value configured?" "uHi, I'm querying the DBpedia SPARQL endpoint using TopBraid Composer. Since the amount of data I'm fetching is quite big, I use an offset value to limit the output. This works quite well up to an offset value of 9000 in my SPARQL query. ORDER BY ?name LIMIT 1000 OFFSET 9000 This is the last value where I get any result. When I enter an offset value of 9001, no triples are fetched anymore. 
This is quite annoying, for when I don't use an offset, TBC/Eclipse collapses under the sheer amount of data that is being returned (> 40000). So my question is: Do you have an internal limit for the offset value configured? And if so, is there a workaround available to get queries working that include a higher offset value? Best regards, Martin uHi Martin, Do you encounter this limit executing the same or similar query directly using the SPARQL Query page of the DBpedia SPARQL endpoint at: If it does then please provide a sample query we can run against the end point to be looked into. Best Regards Hugh Williams Professional Services OpenLink Software On 14 May 2008, at 09:16, Martin Becker wrote: uMartin Becker wrote:" "Best practice mapping redirected templates?" "uHello, what is the best way to manage the following situation? a. The template [1] redirects to the template [2] b. The template [2] contains many parameters, so the mapping [3] is large c. If a mapping for [1] exist, the mapping for [2] will not be used (as indicated in [4]) d. The mapping [5] for the template [1] is required, because the correct classification with conditions is not possible in every situation. The below SPARQL-Query (for such cases. Is copying the parameter mapping from [2] to [1] the only way to resolve this problem (that would be difficult to maintain) or is there a way to inherit parameter mappings? PREFIX dbpedia-owl: PREFIX dbpprop: PREFIX foaf: SELECT * WHERE { ?lake a dbpedia-owl:Lake . ?lake dbpprop:wikiPageUsesTemplate . ?lake foaf:isPrimaryTopicOf ?wikilink. FILTER NOT EXISTS {?lake dbpprop:lakeName ?lakeName }. FILTER NOT EXISTS {?lake dbpprop:imageLake ?imageLake }. FILTER NOT EXISTS {?lake dbpprop:lakeType ?lakeType }. FILTER NOT EXISTS {?lake dbpprop:altLake ?altLake }. FILTER NOT EXISTS {?lake dbpprop:captionLake ?captionLake }. FILTER NOT EXISTS {?lake dbpprop:lakeName ?lakeName }. } [1] [2] [3] [4] [5] Mapping_en:Infobox_lake uHi Jan! exist, the It's the other way around: the redirected template is never used in WP, and its mapping is never used in DBP. See a bit of advice here _infobox this That is the only way. Furthermore, it's the right way: by making the redirect, the WP editors have decided that it will be harder to maintain the two templates separately because they have a lot of commonality. situation. I was in a similar situation for Geopolitical_organization vs Country. I looked for a discriminator field that would distinguish between the two, and had 3 candidates. Cheers! Vladimir uHi Vladimir. Thank you for answering. Unfortunately, there are no discriminator fields matching in every case (see example in my last mail [1]). That's bad news. I hoped, there is a similar way in DBP to utilize this commonalities. So I have to do it in an unpleasant manner. [1] Regards. Jan Martin uHi Jan, Vladimir, I made a separate admin page for the redirected mappings would be nice to reduce them as much as possible where it is easy and then discuss any problematic mappings On Wed, Aug 12, 2015 at 11:50 AM, Jan Martin Keil < > wrote: uGreat list! Made task help! uHi Jan, I know the problem you are referring to and that we may loose specific classes in some cases. The problem is that a wiki bot can rename the redirected template any time without any notice so imho we should better adapt to the Wikipedia templates as soon as possible. Mapping both templates complicates both the extraction framework and the editorial mapping process. 
Maybe we should focus on creating better & more advanced conditional mappings WDYT? (ps. I added another column in the redirects list to make it easier identify what needs to be done) Cheers, Dimitris On Thu, Aug 13, 2015 at 11:08 AM, Vladimir Alexiev < > wrote: uGreat! However, it loads the old name in the “new title:” edit box. I added a bit of documentation: Mapping_Guide#Merge_redirected_templates uHi Dimitris. I think an additional condition could solve this problem: {{Condition | operator = redirectedBy | value = Infobox_lake | mapping = {{TemplateMapping | mapToClass = Lake }} }} I am new in this field, so I have no idea how difficult it is to implement this and maybe I miss some points/cases here. But I think it would make this kind of redundant property mappings unnecessary. Will it be refreshed automatically after a while? Regards. Jan Martin uOn Fri, Aug 14, 2015 at 3:35 PM, Jan Martin Keil < > wrote: nice hack ;) I have to check but iirc the redirect information is lost before the mapping gets applied This get refreshed automatically after a change but, only moving a page does not update the list, you have to delete the redirected/merged mapping to trigger the update It is also a good to cleanup the mapping/ontology redirects as the don't offer anything and might confuse the editors uOn Fri, Aug 14, 2015 at 3:28 PM, Vladimir Alexiev < > wrote: This is the default MW form on move, if you know a way to pre-fill the title and the reason I can easily change it" "Query about DBpedia Data" "uHi all, I am trying to use DBpedia by querying it via SPARQL. I think I am missing something basic, as I noticed the information retrieved by SPARQL Query for the entity is a subset of the information present on the corresponding webpage of the entity. For example the page: has information about dbpedia-owl:birthPlace and so on. But the same is not contained in the related RDF file or information retrieved by SPARQL queries on DBpedia SPARQL Endpoint. Can someone please help me about how to obtain the missing pieces of information like birthPlace, deathPlace of via SPARQL query on DBpedia SPARQL Endpoint for the same resource and others? Thanks, Best Regards Prateek DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" Hi all, I am trying to use DBpedia by querying it via SPARQL. I think I am missing something basic, as I noticed the information retrieved by SPARQL Query for the entity is a subset of the information present on the corresponding webpage of the entity. For example the page: Endpoint. Can someone please help me about how to obtain the missing pieces of information like birthPlace, deathPlace of via SPARQL query on DBpedia SPARQL Endpoint for the same resource and others? Thanks, Best Regards Prateek uHi Prateek, My guess is that you're seeing the \"is [property] of\" links automatically generated by the web interface. Birth place is a property of Person, with range on Place. So to get those properties you need to get instances of Person. See: try? Cheers, Pablo On Thu, May 19, 2011 at 9:08 PM, Prateek < > wrote: uHi Pablo, Thanks for the reply. I was trying a very simple query Select * Where { ?p ?o } at the SPARQL endpoint. Will some modification of the query work? Thanks Prateek On 5/19/11 5:51 PM, Pablo Mendes wrote: uHi all, maybe the describe query is what you are looking fordescribe On Thu, May 19, 2011 at 7:07 PM, Prateek < > wrote: uYes, Putting Montgomery, Ohio at the object should get the people that were born or died there. 
Select * Where { ?s ?p } Or, as Mauricio says, if what you want is all relationships and attributes for a given resource, you can use DESCRIBE. Cheers, Pablo On May 20, 2011 12:07 AM, \"Prateek\" < > wrote: ?o } generated by the web interface. Birth place is a property of Person, with range on Place. So to get those properties you need to get instances of Person. retrieved by SPARQL Query for the entity is a subset of the information present on the corresponding webpage of the entity. not contained in the related RDF file information like birthPlace, deathPlace of via SPARQL query on DBpedia SPARQL Endpoint for the same resource and others?" "Bulgarian DBpedia images extraction" "uHi, We are going to load and host the Bulgarian DBpedia. Currently there are no images extracted from Bulgarian Wikipedia. 1. It would be great if we can extract the images. 2. As an alternative I was thinking to extract the images by SPARQL, and then to perform some processing on them. What do you think? 3. Another option for me is to start playing with the extractor. Cheers, Boyan uHi Boyan, You need to set the bg configuration for the ImageExtractor in [1] In fact you should look at all configuration classes in [2] and see what you can provide for Bulgarian. All these settings will enable some new extractors for bg or improve existing ones Cheers, Dimtiris [1] [2] On Tue, Jan 13, 2015 at 2:17 PM, Boyan Simeonov <" "Build problem" "uHi there! I'm having trouble trying to build dbpedia code inside eclipse. [Disclaimer: I don't know Eclipse well, I don't know Maven ar all!] As far as I can tell, I followed instructions listed at: [Except for installing a scala plugin, since I already had one installed on my eclipse, and was able to write scala projects with it which work fine] I tried to have maven directly download the snv repo, but that did not compile, as a matter of fact I was not even able to see the source files, even though they had been downloaded to my disc. I tried another way, downloading the svn repo manually, and then import in maven. That get slightly better, since I was then able to view the source under Eclipse (though the icon for scala files is a dim blue \"J\", instead of a dark blue \"S\" as in my own project). But the projects (core, dsump, scripts, server, wiktionary) still have a red cross, and project \"live\" has a red \"!\" sign. Looking at the pom.xml (in Eclipse GUI) of \"live\", I see: * Missing artifact com.openlink.virtuoso:virtjdbc:jar:6.1.2:compile But if I look inside maven console, it seems that maven has downloaded this, even multiple times (at lines #13, #82, #152, #244, #252, #291, #330) : 04/03/11 06:21:12 CET: Downloaded 04/03/11 06:21:23 CET: Downloaded 04/03/11 06:21:34 CET: Downloading 04/03/11 06:24:46 CET: Downloaded 04/03/11 06:26:48 CET: Downloaded 04/03/11 06:26:58 CET: Downloaded 04/03/11 06:27:08 CET: Downloaded [I'm skipping 2 or 3 extra occurrences] line #462, I see: 04/03/11 06:29:18 CET: Missing artifact com.openlink.virtuoso:virtjdbc4:jar:6.1.2:compile I find this message quite surprising, since, from previous log, it seems that said jar has already been downloaded 9-10 times! Moreover, the jar is different in live/pom.xml (\"virtjdbc\" vs. \"virtjdbc4\"), which seems odd to me. I really don't know where to look to try and fix that. Could anyone provide me with suggestions? Thanks in advance. 
uHi there, Since Eclipse seems (very) reluctant to let me build dbpedia's extraction, I thought I'd try maven directly from the command tool So, after getting a suitable maven (2.2.1), I went into the \"extraction\" directory and typed \"mvn clean install\". After a (big) while, I got the following error messages: [ERROR] BUILD FAILURE [INFO] Compilation failure C:\dbpedia\extraction\live\src\main\java\org\dbpedia\extraction\live\publisher\PublishingData.java:[9,31] package org.openanzo.client.jena does not exist C:\dbpedia\extraction\live\src\main\java\org\dbpedia\extraction\live\publisher\PublishingData.java:[42,26] cannot find symbol symbol : variable Converter location: class org.dbpedia.extraction.live.publisher.PublishingData C:\dbpedia\extraction\live\src\main\java\org\dbpedia\extraction\live\publisher\PublishingData.java:[73,26] cannot find symbol symbol : variable Converter location: class org.dbpedia.extraction.live.publisher.PublishingData Concentrating on the first error, I looked into my maven repository, where I found\" .m2\repository\org\openanzo\anzo-client\2.5.1\", which contains, among others, the file \"anzo-client-2.5.1.jar\". Extracting said file, I see that it seems to miss the jena subdir: Positionning myself in the \"org\" directory, and typing \"tree /F\", I get: C:. └───openanzo └───client │ ClosableDatasetServiceDataset.class │ DatasetService$CurrentDataset.class │ DatasetService$DatasetServiceDataset.class │ StoredDatasetserviceDatasetProxy.class │ TrackerManager$StmtSetComp.class │ TrackerManager.class │ └───openrdf DatasetServiceSail$1.class DatasetServiceSail.class DatasetServiceSailConnection$1.class DatasetServiceSailConnection.class In effect, package \"org.openanzo.client.jena\" is not defined in my jar So, I figure that my maven install did not download the right version of openanzo[Below is the trace of where it got it from: Downloading: 3K downloaded (anzo-client-2.5.1.pom) ] But looking in the \"pom.xml\" files of dbpedia's extraction, I find that \"extraction\live\pom.xml\" explicitely sets the version to troublesome \"2.5.1\": org.openanzo anzo-client uIt seems like most of these messages come from the live module. This is a little odd, because in the default branch of the DBpedia extraction_framework (Mercurial) repository, the live module is not included. Are you using the latest version from We changed the repo recently from SVN to Mercurial. Using the new repo should work (it does for me). At any rate, you can also exclude the live module manually, by editing the main pom.xml. Just comment it out in the tag: xml version='1.0' encoding='%SOUP-ENCODING%' core dump uOn Fri, Mar 4, 2011 at 11:36 AM, Max Jakob < > wrote: Thanks, that version was successfully handled by maven! May I suggest to modify the following page? Any URL giving a simple (even simplistic) example on how to use it? [E.g., extracting triples from a URL or locally-copied version thereof.] Best regards. uOn Fri, Mar 4, 2011 at 13:25, Serge Le Huitouze < > wrote: Good point. I updated it. Thanks for the hint! After having run mvn clean install in the root directory, go to the server directory and execute mvn scala:run This will start the DBpedia server. A browser tab will open. You can choose a language and then enter the URI for which you want to extract triples. The DBpedia URI corresponds to the English Wikipedia title (see also Cheers, Max" "DBPedia down?" "uHello! 
I am getting 500 http codes when trying to dereference dbpedia resources, such as: Is there any maintenance work going atm? Cheers! y uYves Raimond wrote: Hi, Verizon suffered a sizeable network outage for an hour or so :( Seems to be working again now. Regards, ~Tim" "DBpedia 3.3 - different versions of Geo data description" "uDear, there are now three different versions/properties to describe geo locations: 1. dbpprop:latDeg, latMin, latSec etc. 2. geo:lat, geo:long 3. dbpprop:latLong redirect to dbpedia :Paris/latLong/coord Our PoolParty [1] application made use of version 2. How will the Linked Data community handle this kind of problems in the future? If changes in already widely used schemata are made, this causes several problems. Best wishes, Andreas [1] uwrote: Andreas, The solution has always been to make a set of purpose specific named rules in Virtuoso using owl:subproperty. Once the rules are loaded, you simply use a pragma with your SPARQL queries which applies these rules. We should have a standard set of these mapping rules loaded as part of DBpedia in general. Georgi: have you done anything re. the above based on the DBpedia ontology?" "difference between dump files" "uThanks Roberto and Max for the replies. Two more questions: 1. What is the difference between the \"Ontology Infobox Properties\" dataset (the file mappingbased_properties_en.nt) and the \"Raw Infobox Properties\" dataset (the file infobox_properties_en.nt)? 2. What is the difference between the DBPedia->OpenCyc and the OpenCyc -> DBPedia mapping file? Which is better? Thanks, Jonathan. From: Roberto Mirizzi [mailto: ] Sent: Monday, December 20, 2010 7:42 PM To: Subject: Re: [Dbpedia-discussion] infobox property mapping from english to other languges? Il 19/12/2010 14:31, Yonatan ha scritto: Hi, Is there any way I can extract the equivalent names in English of infobox properties of non-English languages (e.g. prénom -> first name). Hi Jonathan, if a mapping in a non-English language doesn't exist, you currently cannot do it. Anyway I'm working on that, we have produced some code to automatically do what you're looknig for. We're going to release soon something about that. DBpedia team suggested me to wait for the transition to Mercurial control management tool instead of SVN for an easier management of different branches. So I'm waiting for that moment. :-) Cheers, roberto Thanks, Jonathan. The content of this e-mail (including any attachments hereto) is confidential and contains proprietary information of the sender. This e-mail is intended only for the use of the individual and entities listed above. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication (and any information or attachment contained herein) is strictly prohibited. If you have received this communication in error, please notify us immediately by telephone or by e-mail and delete the message (including the attachments hereto) from your computer. Thank you! uIl 22/12/2010 16:14, Yonatan ha scritto: Hi Jonathan, the answer to your question is here: is based on hand-generated mappings of Wikipedia infoboxes/templates to a newly created DBpedia ontology, while the Infobox Properties are extracted from all infoboxes and templates within all Wikipedia articles and are not cleaned or merged. In this case the coverage is larger but data is relatively noisy. Actually on the DBpedia download page there is only the DBpedia -> Cyc linkage. 
Here you can find some info about how the linkage is obtained: Linking from DBpedia to Cyc means you have triples such as: . while Cyc to DBpedia should be: . The difference is who links whom. Merry Christmas! :-) roberto" "page links" "uJust some thoughts here: (1) Sometimes page links get repeated. I think this is just because page A has N links to page B. This doesn't have much semantic impact, but it does bulk up the files a bit (though less w/ bz2) and makes more work for my importer script (2) Some page links point to articles that don't exist. That's a good thing, because \"broken links\" are important to the whole wiki concept. Right now my system is ignoring that stuff, but I've done plenty of stuff with link analysis where you can get good insight about a some set of documents S by looking at links to the expanded set of documents S' that includes (real or imagined) documents that are referred to in document S. (3) Might be nice to extract the anchor text together with the link, though then we're not talking about a triple anymore and have to put in some of those dreaded blank nodes I've been think about training decision rules for a namexer by capturing the text context that pagelinks occur in, but I'd have to write my own extractor to do that. u2010/5/5 Paul Houle < >: That can be important to some extent when computing the PageRank of the wikipedia graph. Or other graph algorithms to mesure the proximity / relatedness of entities. BTW, that would be great if the DBpedia project could compute and distribute the PageRank or the TunkRank [1] values for the DBpedia resources based on the data of the page links graph. This is a really good scoring heuristic when performing fuzzy text named queries with several homonymic matches. [1] Or this could be extracted in an adhoc CSV file since I don't really see the point in having those in a knowlege base / triple store but this is precious data for training machine learning based NLP models. uOlivier Grisel wrote: Olivier, Have you looked at this interface to DBpedia: Also look at the Entity Rank details in the \"About\" section. Basically, you have two ranking schemes in place: 1. Entity Rank u2010/5/6 Kingsley Idehen < >: uOlivier Grisel wrote: u2010/5/6 Kingsley Idehen < >: uOlivier Grisel wrote: uIl 06/05/2010 13.11, Kingsley Idehen ha scritto: uRoberto Mirizzi wrote: uIl 07/05/2010 15:51, Kingsley Idehen ha scritto: We ask search engines through their APIs, looking for some co-occurrence-like/google-similarity-distance-like measure between two DBpedia resources. uRoberto Mirizzi wrote: We do this sort of thing in our sponger cartridges, so it really comes down to simply slotting this into Virtuoso via the Entity Rank customization capability. As you can imagine, this feature is quite esoteric, hence the customization slot etc uRoberto Mirizzi wrote: There are two fun things I've done with \"link\" data sets. One of them is that you treat the in-links and out-links of documents as a 'vector' and compute vector space distances. I've done this with scientific papers [arxiv.org] When the local fan-in and fan-out is high, this can be a good way to find 'related documents'. Developing the NY Pictures site, I generated a list of \"topics related to NYC\" and then computed two aggregates over link targets: (i) incoming links from all other pages, (ii) incoming links from the NYC topic Television networks (many of which were based in NYC) were very strong topics by measure (i) but were much less strong by measure (ii). 
The New York Times was still very strong just looking at links 'local' topics, I think because the NYT was often cited to support WP articles. Doing the vector space stuff I was talking about above, I found that the local fan-in and fan-out was critical for it working In principle it should work very poorly for scientific papers that are cited only two or three times [most of them] but it does better than you'd think, since there's a strong positive correlation between how some papers get cited [one of mine is always cited together with another paper that came out in the same issue of Physical Review] I think looking for link density between categories, spatial regions, types, etc should be a lot of fun and fruitful since the statistics are going to be better. Of course, the first thing you see are the crazy outliers uOn 5/10/10 4:55 PM, Paul Houle wrote: We've found it useful to use the page links to compute a PMI (pointwise mutual information) metric for pairs of pages. We used this to help map entity mentions in text to Wikipedia entities, resolving possible ambiguities (e.g., among the seven George Bushes in Wikipedia. uIl 10/05/2010 23.51, Tim Finin ha scritto: How did you calculate PMI? Do you have some reference to show? uOn 5/10/10 7:29 PM, Roberto Mirizzi wrote: > We assume the standard definition of PMI, e.g., [1]. We were interested in estimating the PMI for pairs of concepts (entities, events, ) that Wikipedia articles denote. The Wikipedia convention is that if you mention a Wikipedia object, you link the first mention to the appropriate Wikipedia article. This has advantages over using, for example, a search engine to find the PMI of two strings that are entity mentions (e.g., \"George Bush\", \"Mr. Quale\") since both are ambiguous. We estimate the PMI for two Wikipedia articles based just on their mentions in other Wikipedia articles, So, given a pair of Wikipedia articles X and Y we compute log(p(X&Y;)/p(X)*p(Y)) where p(X&Y;) is the probability that a Wikipedia page will link to both X and Y and p(X) is the probability that a page will link to X. If X itself links to Y, we count that as contributing to p(X&Y;). We found ~40M pairs with a non-zero value. We are using this in work on extracting linked data from tables [2]. [1] [2] uIl 11/05/2010 2.59, Tim Finin ha scritto: It sounds really interesting. I am reading your work soon. Anyway, I've just two questions: how do you calculate the \"probability\", I mean, starting from a number of incoming links? The second question is: how did you store the 40M pairs?" "Redirects dataset available" "uHi all, a new dataset with all redirects between Wikipedia Articles is now available for testing purpose [1]. It contains 2M redirects, which is quite a lot. Serving this data as Linked Data would make the DBpedia experience more convenient, because users can guess Dbpedia URIs as they do with Wikipedia URLs. Like the disambiguation dataset, it is at the moment neither available at our public Sparql endpoint nor as Linked Data. I would like to take your feedback first before making it available. Cheers, Georgi [1] redirects.nt.bz2 uOn 12 Nov 2007, at 00:33, Georgi Kobilarov wrote: Great! I agree. This should definitely go into the endpoint. Richard" "SPARQL: restricting DESCRIBE queries" "uHi, is there any way to limit the number of triples returned by a DESCRIBE query? The LIMIT clause of sparql obviously applies to the number of result bindings, but each result binding may lead to an arbitrary number of triples. 
Consider the following queries against dbpedia: SELECT ?concept WHERE { ?concept rdfs:label \"Berlin\"@en . } LIMIT 1 returns one result, whereas SELECT ?concept WHERE { ?concept rdfs:label \"Berlin\"@en . } LIMIT 2 returns two results. Now, DESCRIBE ?concept WHERE { ?concept rdfs:label \"Berlin\"@en . } LIMIT 1 returns 28 triples describing dbpedia:Category:Berlin, while DESCRIBE ?concept WHERE { ?concept rdfs:label \"Berlin\"@en . } LIMIT 2 returns some ~4500 triples because it includes the description of dbpedia:Berlin. Any suggestions how I could limit the number of triples returned by such a query? Best, Bernhard uOn 7 Sep 2009, at 15:36, Bernhard Schandl wrote: Yes. There is no way to limit the number of triples in DESCRIBE, because the LIMIT clause in DESCRIBE applies to the number of described *resources*, not to the number of triples in those descriptions. I would perhaps execute the SELECT query that you give below, and then for each of the results of that query, run a \"SELECT * WHERE { ? p ?o}\" with a LIMIT clause, followed by \"SELECT * WHERE {?s ?p }\". Also note that LIMIT is not truly meaningful without ORDER BY. Best, Richard u2009/9/8 Richard Cyganiak < >: You won't however pick up the contents of any nested blank nodes using this method that would have been pulled in by the DESCRIBE implementations on most SPARQL providers, both because they won't have URI's to perform the SELECT on and because the final two sets of queries only pull in one level of nodes. Also, any blank node references in the results of the final SELECT queries can't be joined because they are independent. This area of SPARQL is very messy due to blank nodes, particularly as DESCRIBE hasn't been fully specified and doesn't have configurable behaviour with respect to the GRAPH patterns it uses when resolving the triples related to the resource so you either use DESCRIBE and put up with the lack of a triple LIMIT or use it and get stuck with everything at once even if you didn't want the ?s ?p triples (where ?s is a URI) but you did want blank node recursive closure. ORDER BY is quite slow on all current SPARQL implementations with large datasources in the background. It is no wonder people don't use it even though they technically should. And sometimes you even get errors using ORDER BY [1] (when you didn't previously ;) ) as I did when I attempted to add ORDER BY to this query: SELECT ?concept ?p ?o WHERE { ?concept rdfs:label \"Berlin\"@en . ?concept ?p ?o .} ORDER BY ?concept LIMIT 5000 I am not sure if that error is a Virtuoso bug or a configuration bug. Cheers, Peter [1] nbbd94 uHi Peter, We are looking into the cause of this failing ORDER BY query you report Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 7 Sep 2009, at 23:08, Peter Ansell wrote:" "Dbpedia live - latest dump file - how to get one?" "uHello! I am trying to setup live dbpedia data on my box using Dbpedia live mirror and the latest dump that I find at June 2015. Would a new dump be released anytime soon? My live mirror update has been running for a few days now. Alternatively, if there was a way to speed up the catch up process, that will be great as well! (I have disabled auto indexing on my VOS instance). Thank you. Hello! I am trying to setup live dbpedia data on my box using Dbpedia live mirror and the latest dump that I find at you." "DBpedia on local Virtuoso server" "uHi everyone, i'm trying to load DBpedia into a Virtuoso local server (the open source edition). 
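A sketch of the two-step SELECT workaround suggested in the DESCRIBE thread above, using Jena ARQ against the public endpoint (the endpoint URL and LIMIT values are illustrative, and, as noted in the thread, this does not pull in nested blank nodes):

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class DescribeWorkaround {
    static final String ENDPOINT = "http://dbpedia.org/sparql";

    public static void main(String[] args) {
        // Step 1: find the resources -- this is where LIMIT behaves as expected.
        String selectConcepts =
            "SELECT DISTINCT ?concept WHERE { ?concept " +
            "<http://www.w3.org/2000/01/rdf-schema#label> \"Berlin\"@en } LIMIT 2";
        QueryExecution qe = QueryExecutionFactory.sparqlService(ENDPOINT, selectConcepts);
        ResultSet concepts = qe.execSelect();
        while (concepts.hasNext()) {
            QuerySolution row = concepts.nextSolution();
            String uri = row.getResource("concept").getURI();
            // Step 2: fetch a bounded slice of each description instead of DESCRIBE.
            String slice = "SELECT ?p ?o WHERE { <" + uri + "> ?p ?o } LIMIT 100";
            QueryExecution qe2 = QueryExecutionFactory.sparqlService(ENDPOINT, slice);
            ResultSet triples = qe2.execSelect();
            while (triples.hasNext()) {
                QuerySolution t = triples.nextSolution();
                System.out.println(uri + " " + t.get("p") + " " + t.get("o"));
            }
            qe2.close();
        }
        qe.close();
    }
}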
After building and installing Virtuoso, the moment came to load the data. I downloaded all the .nt dumps I'm interested in into a directory and I used the script I found here: ./importDumps.sh 1111 dba mypassword . I used . as the 4th parameter because the script was in the same directory as the dumps. It took only a few minutes to complete the loading (suspiciously, because I also loaded infoboxes and other big dumps), resulting in no errors and no bad entries. Then I went to the Conductor, to the RDF tab, and there was no dbpedia.org among the graphs, and if I try some queries that work on the public DBpedia SPARQL endpoint, they give no results. What could the problem be? It's the first time I'm using Virtuoso and I really don't know what to check. How can I tell whether the dumps have been loaded or not? If they haven't, how could I do it in some other way? Once the dumps are loaded, can I query a SPARQL endpoint out of the box, or will I need some more work to get it running? Thanks everyone for the answers and excuse me for my noobness, Piero uHi Piero, The following document provides steps to load the latest DBpedia datasets into Virtuoso using the Virtuoso Bulk loading scripts we now provide: This is the method used for loading the current online dbpedia.org datasets. The scripts in the blog post you reference are based on an older script we provided previously and should still work in theory, although we would recommend the newer Virtuoso Bulk RDF load scripts. I am also including the virtuoso-users email address on SourceForge in this reply, as assistance can also be received from this mailing list with any Virtuoso related issues, thus you probably should subscribe to the list at: Let us know how you get on. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 1 Feb 2010, at 14:45, Piero Molino wrote: uHi Piero, Good to hear of your success using the Virtuoso RDF bulk loader scripts for loading the DBpedia datasets. There is a known issue with the Conductor RDF -> Graphs tab failing to display graphs when there are many to display, which is scheduled to be fixed. Posting reply to DBpedia and VOS mailing lists so others can benefit from your success Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 3 Feb 2010, at 15:07, Piero Molino wrote:" "JENA API ACCESS DBPEDIA" "uHi, I encountered a strange problem: I want to access the remote DBpedia SPARQL endpoint through the JENA API. I got different results with two different SPARQL queries in the JENA API. But I can get results with these two SPARQL queries at when I access with \"SELECT DISTINCT ?company where {?company a } LIMIT 3\"; I can get a result as below. log4j:WARN No appenders could be found for logger (com.hp.hpl.jena.util.FileManager). log4j:WARN Please initialize the log4j system properly. log4j:WARN See info. uHi, The only problem I see is the angle bracket behind Germany, which leads to a syntax error. Try again without it.
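For the Jena problem just above, a stray angle bracket is easiest to avoid by letting ARQ assemble the IRI terms itself. A sketch, assuming the intended class was dbpedia-owl:Company (the actual class URI is elided in the message, so that is only a guess):

import com.hp.hpl.jena.query.ParameterizedSparqlString;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class CompanyQuery {
    public static void main(String[] args) {
        // ParameterizedSparqlString writes the IRI with exactly one pair of angle
        // brackets -- a trailing '>' pasted by hand is the kind of thing that makes
        // the remote endpoint reject the query with a syntax error.
        ParameterizedSparqlString pss = new ParameterizedSparqlString(
            "SELECT DISTINCT ?company WHERE { ?company a ?type } LIMIT 3");
        pss.setIri("type", "http://dbpedia.org/ontology/Company");
        QueryExecution qe = QueryExecutionFactory.sparqlService(
            "http://dbpedia.org/sparql", pss.asQuery());
        ResultSet results = qe.execSelect();
        ResultSetFormatter.out(System.out, results);
        qe.close();
    }
}

If the endpoint still answers differently from the web form, printing pss.toString() shows exactly which query text is being sent.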
More information about SPARQL queries at Am 03.09.2013 11:13, schrieb bocai:" "Mapping types" "uHi,I've been trying make use of the dbpedia Types for some concepts and have discovered some of them are either missing or inaccurate, and would like to be able to amend them.I've been told the mapping tool is not available and that I'll have to map them manually, I'm happy to do so, but can't find any instructions on how this is done, I can only find instructions on using the tool for mapping.Any help on this would be hugely appreciated.Best,Jo uHi Joe, Information about creating mappings can be found here: Cheers, Alexandru On Jul 23, 2014 11:06 AM, \"Jo Kent\" < > wrote: uHi,Sorry if I'm being really dense, but I've gone through the pages again, and I still can't see where I can edit them. But this page about the human brain: What is the URL I'd need to go to to amend that incorrect mapping?Many thanks,Jo Date: Wed, 23 Jul 2014 11:49:23 +0200 Subject: Re: [Dbpedia-discussion] Mapping types From: To: CC: Hi Joe, Information about creating mappings can be found here: Cheers, Alexandru On Jul 23, 2014 11:06 AM, \"Jo Kent\" < > wrote: Hi,I've been trying make use of the dbpedia Types for some concepts and have discovered some of them are either missing or inaccurate, and would like to be able to amend them.I've been told the mapping tool is not available and that I'll have to map them manually, I'm happy to do so, but can't find any instructions on how this is done, I can only find instructions on using the tool for mapping. Any help on this would be hugely appreciated.Best,Jo uHi Jo, Im not sure but it seems to me you are confusing wikipedia articles with infoboxes. Infoboxes are predefined templates that are used in a lot of different articles, please refer to the Wikipedia page that explains infoboxes [1] . The mappings wiki is just a mediawiki instance and works like any other mediawiki. First you have to register and then request editor rights on the mailing list. When you have registered, just tell me your account name and I'll give you editor rights. Afterwards you will be able to edit the wiki pages. Cheers Alexandru [1] On Jul 23, 2014 3:15 PM, \"Jo Kent\" < > wrote: uHi, Indeed, such a mapping doesn't exist. The error you've found comes from the inferred types - check the dump \"Mapping-based Types (Heuristic)\" at [1] - which are loaded to the sparql endpoint just as other type info. And yes, we should think how to fix it in the next version. Best, Volha [1] On 7/23/2014 3:15 PM, Jo Kent wrote: uHi,I already have editor rights and have been able to edit and add classes to the ontology, I just don't know how these are mapped to the articles.It's clear on the dbpedia Brain article that it has those types mapped, but I can't see a way of amending them. It would be nice to be able to correct errors, and I really want to be able to correct a few other pages to, but I'm finding it really hard to work out how to access the wiki page for the brain article to edit it.Many thanks,Jo Date: Wed, 23 Jul 2014 15:26:14 +0200 Subject: RE: [Dbpedia-discussion] Mapping types From: To: CC: Hi Jo, Im not sure but it seems to me you are confusing wikipedia articles with infoboxes. Infoboxes are predefined templates that are used in a lot of different articles, please refer to the Wikipedia page that explains infoboxes [1] . The mappings wiki is just a mediawiki instance and works like any other mediawiki. First you have to register and then request editor rights on the mailing list. 
When you have registered, just tell me your account name and I'll give you editor rights. Afterwards you will be able to edit the wiki pages. Cheers Alexandru [1] On Jul 23, 2014 3:15 PM, \"Jo Kent\" < > wrote: Hi,Sorry if I'm being really dense, but I've gone through the pages again, and I still can't see where I can edit them. But this page about the human brain: dbpedia-owl:RecordLabel What is the URL I'd need to go to to amend that incorrect mapping?Many thanks,Jo Date: Wed, 23 Jul 2014 11:49:23 +0200 Subject: Re: [Dbpedia-discussion] Mapping types From: To: CC: Hi Joe, Information about creating mappings can be found here: Cheers, Alexandru On Jul 23, 2014 11:06 AM, \"Jo Kent\" < > wrote: Hi,I've been trying make use of the dbpedia Types for some concepts and have discovered some of them are either missing or inaccurate, and would like to be able to amend them.I've been told the mapping tool is not available and that I'll have to map them manually, I'm happy to do so, but can't find any instructions on how this is done, I can only find instructions on using the tool for mapping. Any help on this would be hugely appreciated.Best,Jo" "problem with iri" "uHi all, I setting up a local Dbpedia Indonesia with virtuoso, but I found some problem with IRI.  dbpedia-owl:wikiPageInterLanguageLink * * * * * * * dbpedia-cs:Sukarno * * * dbpedia-de:Sukarno * dbpedia:Sukarno * http://eo.dbpedia.org/resource/Soekarno * dbpedia-es:Achmed_Sukarno * http://et.dbpedia.org/resource/Sukarno * http://eu.dbpedia.org/resource/Sukarno * http://fa.dbpedia.org/resource/????_??????? * http://fi.dbpedia.org/resource/Sukarno * dbpedia-fr:Soekarno * http://fy.dbpedia.org/resource/Achmed_Soekarno * http://he.dbpedia.org/resource/????_?????? * http://hi.dbpedia.org/resource/??????? * http://hr.dbpedia.org/resource/Sukarno * http://hu.dbpedia.org/resource/Sukarno * http://io.dbpedia.org/resource/Sukarno * dbpedia-it:Sukarno * dbpedia-ja:???? * http://jv.dbpedia.org/resource/Soekarno * http://ka.dbpedia.org/resource/??????? * dbpedia-ko:???? * http://map-bms.dbpedia.org/resource/Soekarno * http://min.dbpedia.org/resource/Soekarno * http://mk.dbpedia.org/resource/?????_??????? * http://mr.dbpedia.org/resource/??????? * http://ms.dbpedia.org/resource/Sukarno * http://nl.dbpedia.org/resource/Soekarno * http://no.dbpedia.org/resource/Sukarno * dbpedia-pl:Sukarno * dbpedia-pt:Sukarno * http://ro.dbpedia.org/resource/Achmed_Sukarno * dbpedia-ru:??????? * http://sa.dbpedia.org/resource/??????? * http://sco.dbpedia.org/resource/Sukarno * http://simple.dbpedia.org/resource/Sukarno * http://sk.dbpedia.org/resource/Sukarno * http://sl.dbpedia.org/resource/Sukarno * http://sr.dbpedia.org/resource/??????? * http://su.dbpedia.org/resource/Sukarno * http://sv.dbpedia.org/resource/Sukarno * http://th.dbpedia.org/resource/???????? * http://tl.dbpedia.org/resource/Sukarno * http://tr.dbpedia.org/resource/Sukarno * http://uk.dbpedia.org/resource/??????? * http://vi.dbpedia.org/resource/Sukarno * http://war.dbpedia.org/resource/Sukarno * http://yo.dbpedia.org/resource/Sukarno * http://zh.dbpedia.org/resource/??? * http://zh-min-nan.dbpedia.org/resource/Sukarno I don't know why there are many character changed by \"?\". Please, help me   Regards, Riko Hi all, I setting up a local Dbpedia Indonesia with virtuoso, but I found some problem with IRI. dbpedia-owl: wikiPageInterLanguageLink http://ar.dbpedia.org/resource/????_??????? http://bcl.dbpedia.org/resource/Sukarno http://be.dbpedia.org/resource/??????? 
http://be-x-old.dbpedia.org/resource/??????? http://bg.dbpedia.org/resource/??????? http://ca.dbpedia.org/resource/Sukarno dbpedia-cs :Sukarno http://cy.dbpedia.org/resource/Sukarno http://da.dbpedia.org/resource/Achmed_Sukarno dbpedia-de :Sukarno dbpedia :Sukarno http://eo.dbpedia.org/resource/Soekarno dbpedia-es :Achmed_Sukarno http://et.dbpedia.org/resource/Sukarno http://eu.dbpedia.org/resource/Sukarno http://fa.dbpedia.org/resource/????_??????? http://fi.dbpedia.org/resource/Sukarno dbpedia-fr :Soekarno http://fy.dbpedia.org/resource/Achmed_Soekarno http://he.dbpedia.org/resource/????_?????? http://hi.dbpedia.org/resource/??????? http://hr.dbpedia.org/resource/Sukarno http://hu.dbpedia.org/resource/Sukarno http://io.dbpedia.org/resource/Sukarno dbpedia-it :Sukarno dbpedia-ja :???? http://jv.dbpedia.org/resource/Soekarno http://ka.dbpedia.org/resource/??????? dbpedia-ko :???? http://map-bms.dbpedia.org/resource/Soekarno http://min.dbpedia.org/resource/Soekarno http://mk.dbpedia.org/resource/?????_??????? http://mr.dbpedia.org/resource/??????? http://ms.dbpedia.org/resource/Sukarno http://nl.dbpedia.org/resource/Soekarno http://no.dbpedia.org/resource/Sukarno dbpedia-pl :Sukarno dbpedia-pt :Sukarno http://ro.dbpedia.org/resource/Achmed_Sukarno dbpedia-ru :??????? http://sa.dbpedia.org/resource/??????? http://sco.dbpedia.org/resource/Sukarno http://simple.dbpedia.org/resource/Sukarno http://sk.dbpedia.org/resource/Sukarno http://sl.dbpedia.org/resource/Sukarno http://sr.dbpedia.org/resource/??????? http://su.dbpedia.org/resource/Sukarno http://sv.dbpedia.org/resource/Sukarno http://th.dbpedia.org/resource/???????? http://tl.dbpedia.org/resource/Sukarno http://tr.dbpedia.org/resource/Sukarno http://uk.dbpedia.org/resource/??????? http://vi.dbpedia.org/resource/Sukarno http://war.dbpedia.org/resource/Sukarno http://yo.dbpedia.org/resource/Sukarno http://zh.dbpedia.org/resource/??? http://zh-min-nan.dbpedia.org/resource/Sukarno I don't know why there are many character changed by \"?\". Please, help me Regards, Riko uHi Riko, Can you try our modified version of DBpedia vad? Cheers, Dimitris On Thu, Apr 4, 2013 at 9:30 AM, Riko Adi Prasetya < >wrote: uOn 4/4/13 3:03 AM, Dimitris Kontokostas wrote: Awesome!! I like the fact that you also kicked off a Github project for this :-) uAlready talked to Patrick Van Kleef on this. This is a forked version, once you publish the DBpedia vad on a stand-alone project we'll try to merge them. Best, Dimitris On Thu, Apr 4, 2013 at 2:36 PM, Kingsley Idehen < >wrote: u0€ *†H†÷  €0€1 0 + uIs the server online? Can you give me an address to just view the problematic webpages and a sparql endpoint for the data? On Fri, Apr 5, 2013 at 12:55 PM, Riko Adi Prasetya < >wrote: uHi Riko, maybe the vad file is outdated and needs rebuilt try this,: run from console $isql 1111 dba dbapassword # where 1111 the virtuoso default port then run: from 1) load dbpedia_local.sql ; 2) load dbpedia_init.sql ; from 3) load description.sql ; For the first 3 steps you can also copy-paste-run the file contents in web-based ISQL from conductor interface from dav browser replace the contents of the existing description.vsp with data from description.vsp On Mon, Apr 8, 2013 at 9:11 AM, Riko Adi Prasetya < >wrote:" "Redirection" "uKingsley, On reflection, I have come to the conclusion that it was rather dumb of me to assume that I could do arbitrary URL redirects, just because the 303 mechanism is used to distinguish pages from RDF resources. 
In effect, I am asking any RDF-aware tool that wants to use my resources to support redirection at a point when it isn't expecting to have to. Even if you were to add this support for your RDF browser, this wouldn't change the situation for the next tool I (or someone else) tried to use. Therefore I now plan to build the RDF generation directly into the ASP which handles 404s, so that the \"resource\" URL actually delivers RDF. Thank you for your support and patience through this learning process. Richard uRichard Light wrote: Richard, No problem :-) I notice you've pinged the LOD mailing list which is a vibrant community re. content negotiation related matters." "500 SPARQL Request Failed: HttpException: 500 SPARQL Request Failed" "uHi, I am trying to run numerous queries to DBPedia through Sparql using my application. All queries work perfectly fine, when I execute them one by one, but when I run all queries at one, I get the following exception: 500 SPARQL Request Failed: HttpException: 500 SPARQL Request Failed The queries are the following: SELECT DISTINCT ?d WHERE { ?cl dbpprop:fullname ?o . ?o \"replacementString\" . ?cl rdf:type ?c . ?c rdfs:label ?d . } LIMIT 2 or SELECT DISTINCT ?b WHERE { ?cl rdfs:label \"replacementString\" . ?cl rdfs:subClassOf ?a . ?a rdfs:label ?b . } LIMIT 2 The problem arises mainly when I include the first query with the option. Is it something with the huge volume of traffic running? Is it with mine application or your server? Thank you for your time. Regards, Mirela Hi, I am trying to run numerous queries to DBPedia through Sparql using my application. All queries work perfectly fine, when I execute them one by one, but when I run all queries at one, I get the following exception: 500 SPARQL Request Failed: HttpException: 500 SPARQL Request Failed The queries are the following: SELECT DISTINCT ?d WHERE { ?cl dbpprop:fullname ?o . ?o \"replacementString\" . ?cl rdf:type ?c . ?c rdfs:label ?d . } LIMIT 2 or SELECT DISTINCT ?b WHERE { ?cl rdfs:label \"replacementString\" . ?cl rdfs:subClassOf ?a . ?a rdfs:label ?b . } LIMIT 2 The problem arises mainly when I include the first query with the option. Is it something with the huge volume of traffic running? Is it with mine application or your server? Thank you for your time. Regards, Mirela" "Use of dbpedia-owl classes in other ontologies" "uI'm curious about what people think of the appropriateness of using a class from the dbpedia ontology as the range in another ontology. Here's the particular case I have in mind. I'm working on an app that will ask faculty at my school to give additional info about their courses by using references to wikipedia pages, and hence to dbpedia. So we'd have triples like: ex:Course1 univ:studiesPerson dbres:Michelangelo . The range of univ:studiesPerson should be just fine as dbpedia-owl:Person. But, imagine this is an art class, and a particular work of art they are studying is the Pieta. It would be nice to have a property univ:studiesWork with range dbpedia-owl:Work to make a triple like: ex:Course1 univ:studiesWork dbres:Piet%C3%A0_(Michelangelo) Inference would then say that the resource for the Pieta is a dbpedia-owl:Work, which I think makes conceptual sense, but I worry that it might run afoul of how those classes came about and how they work (no pun intended). So, what's the consensus on using dbpedia-owl classes as the range for properties in other ontologies? 
Thanks much, Patrick Murray-John http://semantic.umwblogs.org uHi Peter, There was a long discussion on w3c-semweb mailinglist about the dbpedia ontology. But that discussion focused on our use of domains and ranges in dbpedia ontology properties. Some people complained that those properties can't / shouldn't be used in other datasets because our domains and ranges might led to strange reasoning results. However, I'd say you're save using dbpedia ontology classes, especially those first-level-hierarchy classes such as Person, Work, Place, Organization, etc. Thing is, there're no descriptions of dbpedia ontology classes available. What is a Person? What is a Work? What is an Organization? In my opinion, common sense provides answers to these questions. Whether or not common sense is a valid criterion is a different question Other opinions, anybody? Best, Georgi" "Ampersands in URLs" "uI've no idea how widespread this is, but I just failed to get a response for \"Natural_history\" because the RDF/XML contains this URL: and is therefore not well-formed. If resource URLs are to contain ampersands, they surely need to be URLencoded? (Better still not to have them there in the first place?) Richard uOn 21 February 2012 10:41, Richard Light < > wrote: I think you're confusing URL encoding (%26) with XML escaping (&), but that should be XML escaped, certainly. uJimmy, Not, I'm not confused. :-) I just thought that if the \"&\" were URLencoded it wouldn't need to be XML escaped, because as you say it would then read \"%26\", and so wouldn't cause problems to the XML parser. And I thought URLencoding should happen here. To quote a random Web source [1]: \"Only alphanumerics [0-9a-zA-Z], the special characters \"$-_.+!*'(),\" *[not including the quotes - ed]*, and reserved characters used for their reserved purposes may be used unencoded within a URL.\" This seems to be quite a common error: I have tripped over three or four dbpedia URLs containing ampersands in the course of the morning. Unfortunately they br up any resource which mentions them, as well as their \"home page\". Thus I can't access the resource describing lions because it contains a reference to: (!) Richard [1] On 21/02/2012 13:26, Jimmy O'Regan wrote: uOn 21 February 2012 13:47, Richard Light < > wrote: Fair enough. That the URL isn't XML escaped in RDF/XML is clearly and unambiguously a bug; that it isn't URL escaped is more a matter for discussion, but the general consensus will probably be 'do what Wikipedia do', which is to not escape ampersands. uOn 2/21/2012 8:47 AM, Richard Light wrote: My first impression was disappointment that I can't find how many hit dice that monsters like this have: (Maybe somebody just has to RDFize the monster database from Nethack) The amperstands, of course, can be blamed on Wikipedia, which uses them extensively, and generations of web browsers which have been tolerant of broken HTML. Yep, you can write and have it work almost the way that you want. In general it's part of the problem that URL (URI) encoding is screwy in DBpedia, the problems don't seem to be completely understood (it seems like there is more than one \"standard\" to conform to) and nobody is interested in doing anything about it. If you can wait a month or so a RDF product will be become available that will will be broadly similar to DBpedia in scope but will be lacking these problems and most of the other problems too. (Although it won't have all of those delicious \"List of\" pages) uOn 21/02/2012 15:40, Paul A. 
Houle wrote: I'm in no hurry. I was just revisiting a \"Culture Grid Linked Data\" hack which I put together last year, to see how much of it was now broken (\"data rot\", c.f. link rot). The idea is that Culture Grid search result terms are looked up \"blind\" in dbpedia. dbpedia is also used to mediate the search term, en route to grabbing a relevant entry from the BBC Wildlife Linked Data resources. Hence the interest in lions (and bears, and tigers - all of them scuppered by this D&D; monsters list URL). Richard uHi, I have met the same problem recently, on an example which worked not so long before (2 weeks top): If data didn't change recently on dbpedia sparql endpoint could that be a server setting that has been changed ? Julien On 02/21/2012 05:03 PM, Richard Light wrote: uHi Julien, It looks like a serialization problem. We are working on a fix and will install it probably later today. I will let you know as soon as it is fixed. Patrick uHi Julien, This problem has been fixed. Patrick uExcellent. I can once again go on a \"bear hunt\" with my Culture Grid hack. :-) [1] Now with working timeline all done with AJAX calls and XSLT 1.0. Dbpedia data is used to redirect the search term to the relevant BBC Wildlife concept. Richard [1] On 23/02/2012 07:57, Patrick van Kleef wrote: u0€ *†H†÷  €0€1 0 + uOn 02/23/2012 08:57 AM, Patrick van Kleef wrote: uDear all, I just checked a few specs to figure out what would be the best policy for DBpedia regarding URI encoding. In summary, I think DBpedia should encode as few characters as possible, e.g. use '&', not '%26'. The URI spec [1] has a lot of special cases, but in the end it's quite clear that in our case we do not HAVE to encode most special characters like '&'. See 3.3 Path Component. More importantly, the RDF spec includes the following note [2]: Because of the risk of confusion between RDF URI references that would be equivalent if derefenced, the use of %-escaped characters in RDF URI references is strongly discouraged. Could hardly be clearer A related, but different issue is how Wikipedia and Virtuoso dereference URIs. Wikipedia is very lenient: \"&_(EP)\" [3] is equivalent to \"%26_%28EP%29\" [4]. Even \"OS%2F2\" [5] is treated as equivalent to \"OS/2\" [6]. (Not sure which of these bahaviors is or isn't violating the URI spec). Virtuoso on dbpedia.org is very strict: it only returns data for \"OS/2\" [7] and \"&_%28EP%29\" [8], but empty pages for all other encoding variants. Christopher [1] [2] [3] [4] [5] [6] [7] [8] On Tue, Feb 21, 2012 at 15:04, Jimmy O'Regan < > wrote: uOn 3/5/2012 7:14 PM, Jona Christopher Sahnwaldt wrote: I came up with the following encoding function for the path component of a URI based on a close reading of RFC 2397. Any disagreements? 
public static class IRIEscaper { StringBuffer out; public String escape(String key){ out=new StringBuffer(); final int length = key.length(); for (int offset = 0; offset < length; ) { final int codepoint = key.codePointAt(offset); transformChar(codepoint); offset += Character.charCount(codepoint); } return out.toString(); } private void transformChar(int cp) { char[] rawChars=Character.toChars(cp); if(acceptChar(rawChars,cp)) { out.append(Character.toChars(cp)); } else { percentEncode(rawChars); } } private void percentEncode(char[] rawChars) { try { byte[] bytes=new String(rawChars).getBytes(\"UTF-8\"); for(byte b:bytes) { out.append('%'); out.append(Integer.toHexString(0x00FF & (int) b).toUpperCase()); } } catch(UnsupportedEncodingException ex) { throw new RuntimeException(ex); } } // // this code should implement the 'ipchar' production from // // // private boolean acceptChar(char[] chars,int cp) { if(chars.length==1) { char c=chars[0]; if(Character.isLetterOrDigit(c)) return true; if(c=='-' || c=='.' || c=='_' || c=='~') return true; if(c=='!' || c=='$' || c=='&' || c=='\'' || c=='(' || c==')' || c=='*' || c=='+' || c==',' || c==';' || c=='=' || c== ':' || c=='@') return true; if (cp<0xA0) return false; } if(cp>=0xA0 && cp<=0xD7FF) return true; if(cp>=0xF900 && cp<=0xFDCF) return true; if(cp>=0xFDF0 && cp<=0xFFEF) return true; if (cp>=0x10000 && cp<=0x1FFFD) return true; if (cp>=0x20000 && cp<=0x2FFFD) return true; if (cp>=0x30000 && cp<=0x3FFFD) return true; if (cp>=0x40000 && cp<=0x4FFFD) return true; if (cp>=0x50000 && cp<=0x5FFFD) return true; if (cp>=0x60000 && cp<=0x6FFFD) return true; if (cp>=0x70000 && cp<=0x7FFFD) return true; if (cp>=0x80000 && cp<=0x8FFFD) return true; if (cp>=0x90000 && cp<=0x9FFFD) return true; if (cp>=0xA0000 && cp<=0xAFFFD) return true; if (cp>=0xB0000 && cp<=0xBFFFD) return true; if (cp>=0xC0000 && cp<=0xCFFFD) return true; if (cp>=0xD0000 && cp<=0xDFFFD) return true; if (cp>=0xE1000 && cp<=0xEFFFD) return true; return false; } } uChristopher, I don't have strong views about the details of URI encoding. (In my view, both Wikipedia and dbpedia should use simple numeric identifiers for each concept, rather than these stupid and mutable made-up-from-the-page-title ones. But that's maybe a separate thread.) However, I think I need to point out that the reason I started this thread is that URLs containing '&' lead to broken (non-well-formed) RDF/XML. So I think that '%26' is mandatory, whatever happens to other characters. Richard On 06/03/2012 00:14, Jona Christopher Sahnwaldt wrote: uRichard, Only if the XML serializer is broken - '&' must be encoded, that's standard practice in XML. There was a problem in Virtuoso, but that has been fixed: In other words: changing DBpedia URIs is not the right way to fix a broken XML serializer. :-) Christopher On Tue, Mar 6, 2012 at 09:22, Richard Light < > wrote: uHi Paul, thanks for the code! Looks good. A few minor things: - \"public static class\"? - In the comment, it should be not - I think Integer.toHexString(0x00FF & (int)b) may generate one-character codes. - If you're extremely performance-consciuos, you could avoid creating a few temporary objects, and use a decision table for many characters, let's say < 0x10000. I didn't check the codes >= 0xA0, I trust you copied them correctly. ;-) As for the special ASCII chars, yes, those are the ones that are allowed by RFC 3986. Its predecessor explains nicely why the others forbidden. I compiled a blacklist below. 
I added 'ok' to those that IMHO clearly must be escaped. For some, I don't really see a reason why they are forbidden (I wrote 'don't know' below), but they are pretty rare and we sure should escape them. For others, I still don't see a reason, but MediaWiki doesn't use them so we shouldn't either [2] (I wrote 'ok for wikipedia'). We shouldn't even generate any URIs containing any of these characters, escaped or not, subject or object. (They may occur in where we must escape them.) The one character that we should not escape is the slash. MediaWiki actually unescapes it when it is used in an internal link. [1] But that's an implementation detail. It may be cleaner to have a generic IRIEscaper and afterwards just do uri.replace(\"%2F\", \"/\"). forbidden characters: \" - don't know # - ok % - ok / - not ok < - ok ? - ok [ - ok for wikipedia \ - don't know ] - ok for wikipedia ^ - don't know ` - don't know { - ok for wikipedia | - ok for wikipedia } - ok for wikipedia Regards, JC [1] [2] On Tue, Mar 6, 2012 at 02:09, Paul A. Houle < > wrote:" "Gathering history of wikipedia categories" "uHello, I am writing my master's thesis with the categorization data of wikipedia. I could find the raw data I needed, except the categorization of a wikipedia page on a specific date and how it changes with the time. Is there a data set on dbpedia where I can gather this information directly or indirectly? I looked to some datasets, but unfortunately I couldn't find the right one. Thanks in advance! Best Wishes, Altug Akay uHi Altug, This is probably matter of the wikipedia itself" "Just when you thought it was safe" "uOn Tue, Apr 27, 2010 at 10:13 PM, Paul Houle < > wrote: You might find FAO's geopolitical ontology saves you some work - -> rapper: Parsing returned 12776 triples Dan-Brickleys-MacBook-Pro:extraction danbri$ rapper ~/Downloads/geopolitical_v09.owl | grep \"rdf-syntax-ns#type\" | grep \"#self_governing\"| wc -l rapper: Parsing URI file:///Users/danbri/Downloads/geopolitical_v09.owl with parser rdfxml rapper: Serializing with serializer ntriples rapper: Parsing returned 12776 triples 212 (I'm not the only one who does their rdf querying with grep, right?) 212 is at least in the right ballpark, and FAO's data makes various kinds of careful distinction if you dig properly into the data. cheers, Dan On Tue, Apr 27, 2010 at 10:13 PM, Paul Houle < > wrote: I took another look at the dbpedia ontology types and found something else disturbing:  dbpedia has an order of magnitude more 'Countries' than most authorities believe exist,  for instance Dan uAnd what about this : 248 countries, with ISO-3166 codes + geonamesID from which you can build the URI in Granted, this obscure but precious text file would need an RDF mirror. Bernard 2010/4/28 Dan Brickley < > uDan Brickley wrote: uBernard Vatant wrote: similar data can be fetched from freebase e.g. and individual country data can then be exported in rdf to be joined with dbpedia: -jg uIt figures there is still a lot of linkage to do in this space, and a lot of redundant efforts. Hopefully all those will someday soon me meshed in a Global Giant Geo Data Space, *courtesy of Kingsley* :) Bernard 2010/4/28 John Giannandrea < > uHi Paul, On 27.04.2010 22:13, Paul Houle wrote: The rdf:types in DBpedia are based on the template to ontology mappings which are maintained in the mappings wiki ( As of DBpedia 3.5 the \"Infobox former country\" template is mapped to the ontology class Country. E.g. the SFR Yugoslavia ( has been a country once, that is a fact. 
So if you want to filter current countries you might want to check whether they don't have a dissolution year ( The \"former\" fact might be useful to be added as a triple in the future, but the ontology class should remain Country in my opinion. Cheers, Anja uBernard Vatant wrote: Yep! :-) John: this is a serious project, and happy to collaborate as always! Kingsley uAnja Jentzsch wrote: I've got two personal selfish interests here that conflict. On one hand I'm developing proprietary tools for dealing with \"real life\" ambiguous data, so the harder it is for other people to work with generic databases, the more of an advantage I have. On the other hand, I see a lot of value in participating in a vital linked data community, so another part of me wants linked data to be easy so that people use it. It ought to be dead simple to write a SPARQL statement to do something like (i) Make a list of the top 20 countries by order of GDP, or (ii) Make a list of the top 10 cities in the world by population Now, there are quite a few \"former countries\" that cast a serious shadow on the modern world and still show up in discourse. For instance, the \"DDR\" is a part of Germany that still has a distinct character (for one thing, I like the cookies you can get in East German bakeries better.) Plenty of people are sloppy today and talk about \"Czechoslovakia,\" which hasn't existed for some time. The \"Former Soviet Union\" is still a relevant geographical division as well. Dig a little deeper than that and you find that there are some real controversies as to what \"countries\" currently exist as well. Perhaps the strangest and most significant case is that Taiwan and the PRC don't recognize another and as a result, Taiwan is not a member of the U.N. Eritrea has an undefined border with Ethiopia, but has an ISO digraph. There are a number of \"territories\" such as Somaliland, Abkhazia, South Ossetia and Transnistria that claim sovereignty, levy taxes, and some level of diplomatic recognitions with some countries. There have also been a number of projects where people have tried to start countries in obscure places because they wanted to create a tax jurisdiction or hoped to get an ccTLD. If you want to split hairs as to what is a country or not, you can go a pretty long way: people try to settle these disputes with deadly weapons all the time. Personally I'm going to stick to the ISO 3166-1 list which is workable, if polluted with barely-inhabited islands. To work entirely in dbpedia it would be nice to have ISO 3166-1 digraphs in there. The infoboxes contain ccTLDs that are almost coded consistently and contain the usual bits of inconsistency that people introduce (for instance, the country with the \"UK\" domain name has an ISO digraph of \"GB\") There is a list in wikipedia of the iso digraphs which could probably be extracted as some kind of list uOn 29 April 2010 01:09, Paul Houle < > wrote: Freebase would have a significant advantage with person detectors, particularly for living people, because there is a large focus on Wikipedia to make sure that articles about living people are regularly patrolled due to legal issues. So every living person article is supposed to have an infobox on its talk page that contains \"{{WPBiography|living=yes|\". Many non-living people may have \"{{WPBiography|living=no|\", which is just as easy to identify. If you want data quality changed, it would be best for you to focus on Wikipedia rather than DBpedia IMO. 
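A sketch of the dissolution-year filter Anja suggests above, written against the public endpoint with Jena (the exact property URI is elided in her message, so dbpedia-owl:dissolutionYear is assumed here):

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class CurrentCountries {
    public static void main(String[] args) {
        // Countries with no recorded dissolution year, per the filtering idea above.
        String query =
            "PREFIX dbo: <http://dbpedia.org/ontology/> " +
            "SELECT ?country WHERE { " +
            "  ?country a dbo:Country . " +
            "  OPTIONAL { ?country dbo:dissolutionYear ?gone } " +
            "  FILTER(!bound(?gone)) " +
            "} LIMIT 300";
        QueryExecution qe = QueryExecutionFactory.sparqlService(
            "http://dbpedia.org/sparql", query);
        ResultSet rs = qe.execSelect();
        ResultSetFormatter.out(System.out, rs);
        qe.close();
    }
}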
Have you tried talking to the people at [1] about your concerns? Cheers, Peter [1] Wikipedia:WikiProject_Cities uPeter Ansell wrote: uKingsley Idehen wrote: I'll look into what it takes to fix wikipedia. Methodologically, however, the last kind of query that I want to run over DBpedia live (or freebase) is a query like \"give me a list of all cities;\" that kind of query almost certainly overruns the returned limit of either the SPARQL or MQL implementation. If I end up rebuilding 'Isidore', my framework for combing through lots of records in generic databases, it's going to be around a 'taxonomic core' that keeps track of identifying identification for objects and of the major classes that organize them. That's something that I like building from a general dump, because I feel more confident that I didn't run into 'data holes' hidden by an API. Practically, a complete taxonomic core for dbpedia and Freebase is reasonably small and easy to handle. Once I've got it, I can (in principle) identify classes of objects that I'm interested in and then use SPARQL or MQL APIs to fill in details about particular ones. For instance, for ny-pictures, I found about 10,000 'things' related to New York City and loaded those into a system that's pretty heavyweight and expensive. I found both dbpedia live and live access to freebase to be very useful for filling out information that was missing from the dbpedia dump that I started ny-pictures from; in particular, a lot of geographical coordinates have been added in the last few month. Along these lines, I really like the topic dumps that metaweb publishes; these are good 'map' of Freebase which makes it possible to selectively pick out stuff to look at with MQL, without having to figure out how to interpret those big FB dump files. uPaul Houle wrote: Build a Data Window using OFFSET and LIMIT. Of course :-) Yes. Would be nice if the end product made its way back into the LOD cloud. Hopefully, when we are set with RDF delta shipping (which basically enables DDE or yore but for Linked Data via pubsubhubbub) issues like this will be easier to handle." "categories with a comma in their names" "uHi all, Do you have any idea how to query for categories with a comma in their names like , because it is not accepting in the normal form." "Official DBpedia Live Release" "uDear all, the AKSW [1] group is pleased to announce the official release of DBpedia Live [2]. The main objective of DBpedia is to extract structured information from Wikipedia, convert it into RDF, and make it freely available on the Web. In a nutshell, DBpedia is the Semantic Web mirror of Wikipedia. Wikipedia users constantly revise Wikipedia articles with updates happening almost each second. Hence, data stored in the official DBpedia endpoint can quickly become outdated, and Wikipedia articles need to be re-extracted. DBpedia Live enables such a continuous synchronization between DBpedia and Wikipedia. The DBpedia Live framework has the following new features: 1. Migration from the previous PHP framework to the new Java/Scala DBpedia framework. 2. Support of clean abstract extraction. 3. Automatic reprocessing of all pages affected by a schema mapping change at 4. Automatic reprocessing of pages that are not changed for more than one month. The main objective of that feature is to that any change in the DBpedia framework, e.g. addition/change of an extractor, will eventually affect all extracted resources. It also serves as fallback for technical problems in Wikipedia or the update stream. 5. 
Publication of all changesets. 6. Provision of a tool to enable other DBpedia mirrors to be in synchronization with our DBpedia Live endpoint. The tool continuously downloads changesets and performs changes in a specified triple store accordingly. Important Links: * SPARQL-endpoint: * DBpedia-Live Statistics: * Changesets: * Sourcecode: * Synchronization Tool: Thanks a lot to Mohamed Morsey, who implemented this version of DBpedia Live as well as to Sebastian Hellmann and Claus Stadler who worked on its predecessor. We also thank our partners at the FU Berlin and OpenLink as well as the LOD2 project [3] for their support. Kind regards, Jens [1] [2] [3] http://lod2.eu u[Limiting CC list] also interested in this question. On a related note, will there be intermediate solution? Thanks, Tom uHi Tom, On 06/24/2011 02:45 PM, Tom Heath wrote: Thank you for feedback, this issue is now fixed. uHi Tom, On 24.06.2011 14:45, Tom Heath wrote: Currently, no. (Of course they will reflect the latest changes every few months in case of a release.) No, the endpoints will run in parallel (for now at least). Some explanation: There are two extraction modes of DBpedia: * dump based ( * live (via update stream) The dump based extraction is performed in many (>90) languages. It generates all the files at are loaded in the official endpoint ( the English Wikipedia, but also labels and abstracts in different languages. The live version of DBpedia currently works on the English Wikipedia edition and does not generate dumps. Currently, we plan to run both in parallel, so the live version does not supersede the static dump based extraction. Of course, anything we are doing is open for discussion and we welcome suggestions. I'll post other replies on the DBpedia mailing list to avoid too much cross mailing list traffic. Kind regards, Jens uHello, On 24.06.2011 14:52, Thomas Steiner wrote: That's a valid and good question, which is, however, not that easy to answer. For now, we went the simple route and do not serve DBpedia Live data as Linked Data, although I see that it would be desirable to have it. If we serve it from implies changing the resource URIs accordingly (prefix static URIs. A question would be whether it is desirable to have two URIs for exactly the same thing from exactly the same source? If we would decide to have different URIs for the static and live version, then a related question is whether it is better to use - or - The latter requires more changes on our (OpenLink, FUB, AKSW) side, but might be more plausible in the mid/long term. Another option would be to use a single URI and a content negotiation mechanism, which can deal with time ( which would however introduce additional complexity. Input/opinions on those issues are welcome (if there is a best practice for this case, please let us know). Kind regards, Jens uDear Jens, DBpedians, Treating Web page (all Semantic Web / Linked Data foo aside) that has a temporarily more up-to-date version of encouraged to use long-term, the current best practice (from a search engine's point of view) is to place a so-called canonical link [1] on a meta tag in the head section of the page, or (I guess in this case the preferred solution) via a Link header in the HTTP header. What do you think? Best, Tom [1] answer.py?answer=139394 uOn 6/24/11 4:38 PM, Jens Lehmann wrote: Jens, Re. Linked Data do remember what's already in place (as part of the hot staging of this whole thing) at: When that was constructed it included Linked Data deployment, naturally. 
Since Virtuoso is a common factor, its a VAD install to get a replica via live.dbpedia.org . Anyway, I know this is early days on the live.dbpedia.org side of things, and this is more about a SPARQL endpoint than entire Linked Data deliverable. Anyway, when it comes to Linked Data and all the other questions posed above, best to first look at what's already been done (over a year now) re: http://dbpedia-live.openlinksw.com :-) Examples (note: owl:sameAs relations): 1. http://dbpedia-live.openlinksw.com/resource 2. http://dbpedia-live.openlinksw.com/page/Slightly_Odway 3. http://dbpedia-live.openlinksw.com/describe/?url=http://dbpedia.org/resource/Slightly_Odway . Again install DBpedia vad and the issue of canonical URIs vanishes, the descriptor pages work (for humans or machines), and re. actual Linked Data is all good, uOn 6/24/11 5:08 PM, Kingsley Idehen wrote: Should have been: 1. 2. 3. Slightly_Odway uI take the point of view that Linked Data are claims, rather than facts. Claims are made by different people/datasources, possibly conflicting, and the consumer decides what/who to believe. I think that both dbpedia.org and live.dbpedia.org should provide claims about the same URIs, without requiring the sameAs indirection. In this case, I would choose to see contains assertions about abbreviate to live:Slightly_Odway and dbpedia:Slightly_Odway respectively. An HTTP request to live:Slightly_Odway would return a description of a graph named live:Slightly_Odway, which in turn has quadruples about dbpedia:Slightly_Odway. dbpedia:Slightly_Odway rdf:type ns7:DebutAlbums live:Slightly_Odway . dbpedia:Slightly_Odway rdf:type ns7:JebediahAlbums live:Slightly_Odway The fact that it returns a description of itself complies with Linked Data principles. The fact that these triples are talking about another URI may look unconventional at first, but it's actually common in the wild. See [1]: yago-res:Slightly_Odway owl:sameAs ns2:Slightly_Odway . We might need to add some mechanism to convey in triples the same message of the quadruples, or we may assume that if somebody asks for NT, they do not care about the provenance, and just serve triples about dbpedia:Slightly_Odway. If they ask for NQ, then we give them the quadruples. This solution would allow people to easily integrate data in a simple query, retaining the ability of telling apart the sources, without requiring inference. SELECT * WHERE { GRAPH ?dbpedia { dbpedia:Slightly_Odway ?p ?o . } GRAPH ?live { dbpedia:Slightly_Odway ?p ?o . } uI like this solution. Especially if a request to Pablo On Fri, Jun 24, 2011 at 5:55 PM, Thomas Steiner < > wrote: uThey are assertions. IIRC the terminology was a bit like: statement: 'The moon is made of cheese' assertion (I now make a claim about the world, for example by publishing a respective RDF triple ): 'The moon is made of cheese' fact: turns out to be a wrong assertion (as far as we know modulo the moon landing conspiracy story) Nearby: [1]. Now waiting to get beaten up by Pat and or Alan for my naivety w.r.t. discrete math ;) Cheers, Michael [1] uAll, The Virtuoso DBpedia package driving the /page, /resource and /data endpoints already contains this feature for the last couple of years, both in form of Link headers as well as META tags. It was designed from the start to provide provenance links, not only to the original Wikipedia article, but also to the canonical dbpedia.org for those who want to setup their own dbpedia database in- house or on Amazon EC2. 
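A hedged sketch of the provenance-aware querying Pablo describes above. It assumes a hypothetical local store at localhost:8890 in which the static and live extractions are loaded as separate named graphs; the public endpoints do not expose the data this way, and the single GRAPH variable is a simplification of his two-graph join:

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class ProvenanceAwareQuery {
    public static void main(String[] args) {
        // Hypothetical local endpoint holding both extractions under named graphs.
        String endpoint = "http://localhost:8890/sparql";
        String query =
            "PREFIX dbpedia: <http://dbpedia.org/resource/> " +
            "SELECT ?g ?p ?o WHERE { " +
            "  GRAPH ?g { dbpedia:Slightly_Odway ?p ?o } " +
            "} LIMIT 100";
        QueryExecution qe = QueryExecutionFactory.sparqlService(endpoint, query);
        ResultSet rs = qe.execSelect();
        while (rs.hasNext()) {
            QuerySolution row = rs.nextSolution();
            // The graph name tells apart which source asserted the triple.
            System.out.println(row.get("g") + " says: " + row.get("p") + " " + row.get("o"));
        }
        qe.close();
    }
}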
Please use the URI Debugger from Berlin University to see what i mean: which shows the data behind the link: Also check out the \"RDF extraction\" feature on this URI debugger page. Note that previous version of the dbpedia \"live extraction code\" but will soon be updated to use the new extraction framework. Patrick uHello Kingsley, On 24.06.2011 18:08, Kingsley Idehen wrote: Of course we are aware of this, but were more focused on getting the live extraction framework and endpoint working properly. We are, of course, also very happy to have two working DBpedia live endpoints (with the OpenLink one being even more powerful on hardware I guess). We had quite some internal discussions and decided to use the VAD for now. Mohamed installed it, so there is now Linked Data for DBpedia Live: (Whether we will keep using this URL scheme in the future still needs to be decided.) Kind regards, Jens uOn 7/6/11 1:35 PM, Jens Lehmann wrote: Great! Note, the URI style above is just an indirection to the canonical URI. To see what I mean, execute the following via SPARQL endpoint: 1. describe 2. describe . You only get data for #2. Thus, the solution is simply delivering an indirection via HTTP level (typically for page readers) while keeping the actual DBpedia canonical URI and the data its delivers from this instance intact. If a service seeks to mesh data from both the static and live dbpedia instances (at least for now) this is where SPARQL-FED or an owl:sameAs and resulting inference based union expansion would come into play, as options." "how to get the prefix results for a resource via sparql query?" "uHi, I see the triples for a resource with the prefix , by visiting a url for the resource in the format However, when I issue a sparql query against the Is there a way in which I can query the live.dbpedia.org sparql endpoint for triple results that use the prefix ? I am wondering if I am missing some triple results because of not building my sparql query correctly, as I am only able to get the triples with prefix ThanksArun uHi Arun, On 02/25/2013 04:10 AM, Arun Chippada wrote: The issue is fixed now and both methods should give the same results now. uThanks Morsey. Yes, I am able to see that both the methods are now returning the triple results with the prefix , for live.dbpedia.org. Do both of the methods (sparql query against endpoint and dereferencing via ThanksArun Date: Mon, 25 Feb 2013 21:06:44 +0100 From: To: CC: Subject: Re: [Dbpedia-discussion] how to get the prefix results for a resource via sparql query? Hi Arun, On 02/25/2013 04:10 AM, Arun Chippada wrote: Hi, I see the triples for a resource with the prefix , by visiting a url for the resource in the format for the movie \"Silver linings playbook\", I see several triples with the prefix by visiting However, when I issue a sparql query against the the returned results are using the prefix and I was never able to get any triples with the prefix . For example, for the below sparql query, all of the returned triples were using the prefix Is there a way in which I can query the live.dbpedia.org sparql endpoint for triple results that use the prefix ? I am wondering if I am missing some triple results because of not building my sparql query correctly, as I am only able to get the triples with prefix Thanks Arun The issue is fixed now and both methods should give the same results now. 
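Since both services answer queries about the same canonical http://dbpedia.org/resource/ URIs, the simplest way to compare the static and live endpoints discussed above is to send the same SELECT to each. A sketch (the resource and LIMIT are arbitrary, and endpoint availability may of course vary):

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class StaticVsLive {
    public static void main(String[] args) {
        // Same canonical subject URI, asked of both services.
        String query =
            "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Berlin> ?p ?o } LIMIT 50";
        for (String endpoint : new String[] {
                "http://dbpedia.org/sparql", "http://live.dbpedia.org/sparql" }) {
            System.out.println("== " + endpoint);
            QueryExecution qe = QueryExecutionFactory.sparqlService(endpoint, query);
            ResultSetFormatter.out(System.out, qe.execSelect());
            qe.close();
        }
    }
}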
uHi Arun, On 02/26/2013 01:57 AM, Arun Chippada wrote:" "Duplicate triples logged as deleted at DBPedia-Live" "uHi, I was checking the live updates log, specifically this one: I noticed that it has a lot of repeated triples, e.g. < http://dbpedia.org/property/length> \"2740.0\"^^< http://dbpedia.org/datatype/second> . < http://dbpedia.org/property/type> . I know that technically it does not affect the storage, but I would like to know what is the reason of this \"dup\". Thanks in advance, uHi Luis, On 02/28/2012 05:19 PM, Luis Daniel Ibáñez González wrote: Thank you for spotting that issue, and we will fix it soon. uHi Luis, On 02/28/2012 05:19 PM, Luis Daniel Ibáñez González wrote: The problem is now resolved, and thanks for your reporting." "Hello DBPedia!" "uHi! First, a brief introduction. My name is Roberto Alsina, and my team at Canonical is using DBPedia in the upcoming ubuntu touch phone operating system to improve a suggestions engine. What we are doing is, when a user searches for something, we look it up in wikipedia, and then use the entity name in dbpedia to get properties, which we then associate with different results. For example: User types \"Metallica\" => Wikipedia matches => DBPedia says \"Type::band\" => We suggest searching in grooveshark and youtube All in all, the approach works remarkably well, but we are finding some missing mappings, and we'd like to help improve DBPedia :-) For example: most actors don't have occupation::Actor. Or, publicly traded companies (example: Microsoft) have a \"Traded as\" field in their infoboxes but no matching data in DBPedia. For the latter, adding mappings in Looking forward to working on this :-) Hi! First, a brief introduction. My name is Roberto Alsina, and my team at Canonical is using DBPedia in the upcoming ubuntu touch phone operating system to improve a suggestions engine. What we are doing is, when a user searches for something, we look it up in wikipedia, and then use the entity name in dbpedia to get properties, which we then associate with different results. For example: User types 'Metallica' => Wikipedia matches => DBPedia says 'Type::band' => We suggest searching in grooveshark and youtube All in all, the approach works remarkably well, but we are finding some missing mappings, and we'd like to help improve DBPedia :-) For example: most actors don't have occupation::Actor. Or, publicly traded companies (example: Microsoft) have a 'Traded as' field in their infoboxes but no matching data in DBPedia. For the latter, adding mappings in" "Add your links to DBpedia workflow version 0.1 (this is also an RFC)" "uDear all, we thought that it might be a nice idea to simplify the workflow for creating outgoing links from DBpedia to your data sets. This is why we created the following GitHub repository: Please feel free to add new files and change the links and then send us a *pull request*. This message is an announcement as well as a request for comments. Here is a (non-exhaustive) list of open issue: - it is yet unclear, when the links will be loaded into - we plan weekly updates to - yago, freebase and flickrwrappr have been excluded due to their size ( > 0.5GB ) - there will be some quality control; not everybody will be able to include any links he wants to include. We are open to ideas how to manage this. Consider \"pull requests\" as \"application for inclusion\" - folder/file structure is still very simple, we will adapt upon uptake All the best, Sebastian" "Missing package !" 
"uHello everyone, I am trying to find a java package (\"jena_sparql_api\") which was developed as a wrapper to simplify sparql queries. However, I am not able to find a package for Jena_SPARQL_api which had cache\core and cache\extra. Please let me know where can I download the same. Thanking you in advance. Regards, Ankur. Hello everyone, I am trying to find a java package ('jena_sparql_api') which was developed as a wrapper to simplify sparql queries. However, I am not able to find a package for Jena_SPARQL_api which had cache\core and cache\extra. Please let me know where can I download the same. Thanking you in advance. Regards, Ankur. uCheck There you find how to set up your maven pom.xml. If you don't use maven, you can find the jars in the repository that is named there. Best /Magnus Am 08.03.2014 um 09:02 schrieb Ankur Padia:" "Project Announcement: Infobox Mapping" "uHi everyone, I'm Peng and a first-year M.Sc. in University of Alberta. My GSoC project is \"Inferring Infobox Template Class Mappings From Wikipedia and WikiData\". I've had a discussion with my mentor Nelish and figured out the whole picture of the work this summer. I plan to maintain the project public page on my own blog: public page . I will update my progress on this page every week. I'm glad to make my own efforts to DBpedia this summer. Any advice on my project will be appreciated. Thanks! Best Regards Peng Xu" "When will be the next release of DBPedia 3.8?" "uHello, Forks, My question is as the subject. Sorry for the bothering if there is already an answer before. Rong a new comer to DBPedia Hello, Forks, My question is as the subject. Sorry for the bothering if there is already an answer before. Rong a new comer to DBPedia uHi Rong, sorry for the late reply - our target for the DBpedia 3.8 release is late April. Also see this question on stackoverflow: Regards, Christopher On Fri, Mar 23, 2012 at 04:56, Rong Shen < > wrote:" "LinkedUp Veni Competition: Linked and Open Data for Education" "uApologies for cross-posting please circulate to anyone who may be interested" "semantic in URIs, was:dbpedia-links: Recommendation for predicate "rdrel:manifestationOfWork" ?" "uAm 07.05.2013 08:54 schrieb Mathieu Stumpf : it is not me who 've minted the predicate. If you want to know, just follow the link I think you mix something up. Don't put semantics into URIs! every predicate is a URI and thus has to follow the URI conventions[1], and that's it. From that, it is totally arbitrary and semantic agnostic. A label, on the other hand, would be something like skos:prefLabel where you get the natural language label (as values of skos:prefLabel predicate). This can be used for the presentation level of your app. See [2] for an example, 1. as an arbitrary URI and 2. look at the skos:prefLabel where you find the preferred labels for this URI in different languages. While URIs are semantically agnostic I personally find it convenient to have something speaking like dct:title in comparison to isbd:P1006 . And I would prefer english even if its not my native language, because English is todays lingua franca. hope this helps, oo [1] [2] T1001 uHi Mathieu, I think you have several questions and I will try to answer them: 1. CamelCase is easier to read. You can look at concatenated strings and see the words immediately 2. I18N: well, I have to admit, it is a little bit unfair to use English names in URIs, but these are more for developers. 
Normally, you wouldn't have any special chars in your Java or Python method names: \"public void übersetzen () {} \" would be very unusual. However, internationalization is a real problem, here are pointers: There is a paper about this here: and also a W3C group (you are welcome to join): 3. The discussion whether to use readable URIs or IDs is very old. I couldn't find a good thread, but you might start here: Both have advantages, IDs are more scalable and also you can have versioning. The readable URIs are better for writing queries by hand. Sebastian Am 07.05.2013 14:48, schrieb Mathieu Stumpf:" "dbpedia liveupdates" "uDear All, I am working on collecting the Films/TVShows from dbpedia and all their relevant corresponding properties ( starring, directors, language,) as well as their abstracts, image and wikipdiaIDs. I downloaded the dbpedia3.9 but looks like starting Nov 2013 , they have started putting the liveupdates online. I am interested to get the recent movies/TVShows. My problem is that for each they there are way too many added and removed daily update files. I have two questions 1- Is there a single daily dump that includes all added and removed files? 2- Do the movie liveupdates include all the properties that the named Entities from the official release do? For example there is an infobox_properties file that includes some info about the dbpedia entries, so how about once a new entry is introduced, what happens to these infobox info? Are they included? Thank you Dear All, I am working on collecting the Films/TVShows from dbpedia and all their relevant corresponding properties ( starring, directors, language,) as well as their abstracts, image and wikipdiaIDs. I downloaded the dbpedia3.9 but looks like starting Nov 2013 , they have started putting the liveupdates online. I am interested to get the recent movies/TVShows. My problem is that for each they there are way too many added and removed daily update files. I have two questions 1- Is there a single daily dump that includes all added and removed files? 2- Do the movie liveupdates include all the properties that the named Entities from the official release do? For example  there is an infobox_properties file that includes some info about the dbpedia entries, so how about once a new entry is introduced, what happens to these infobox info? Are they included? Thank you uDear Hamid For more details on how DBpedia Live works I suggest you read the related publication [1]. Regarding the updates, we offer the dbpintegrator tool [2] that can sync your local triple store with DBpedia Live. This is the suggested option if you plan to make heavy use of the endpoint. Once you have the syncing done,we now provide the following properties: - dbpedia-owl:wikiPageExtracted - dbpedia-owl:wikiPageModified that you can use to check for newly extracted pages. DBpedia Live gets all information from the articles except from abstracts and images. Abstracts will be supported soon while for images we cannot offer real-time updates due to the current image Extractor architecture. Best, Dimitris [1] Mohamed Morsey, Jens Lehmann, Sören Auer, Claus Stadler, Sebastian Hellmann, (2012) «DBpedia and the live extraction of structured data from Wikipedia\", Program: electronic library and information systems, Vol. 46 Iss: 2, pp.157 – 181 [2] On Thu, Mar 13, 2014 at 10:17 PM, Hamid Ghofrani < > wrote:" "Is yago2class dataset for download a reduced dataset£¿" "uHi all,Currently we are developping an application based on DBPedia entities and YAGO classes. 
The main source dataset is \"Links to YAGO\" on page According to YAGO: \"Conceptually, the hypernymy relation in WordNet spans a directed acyclic graph (DAG) with a single root node called entity\" @NameOfTheOntology, now considering the YAGO2 dataset for the class hierarchy, is it a reduced set (i.e. without duplicated items such as the third line) like the following case? class1 subclass class3; class2 subclass class3; class1 subclass class3 (can be deduced from the first 2 triples) @dbpedia-discussion, regarding the two hypotheses (1. YAGO2 did provide a reduced dataset, 2. YAGO2 didn't), would your extraction algorithm lead to the following items in the \"Links to YAGO\" dataset? entity1 type leafClass1; leafClass1 subclass upperClass1; entity1 type upperClass1 (can be deduced from the first 2 triples) PS: In fact, according to our statistics extracted from \"Links to YAGO\", there are 1042 classes considered both as leafClass and upperClass. That's why we suspect there may be duplicated items in this dataset. Your help would be greatly appreciated! Thanks in advance. BR, Ivan Lv from China" "Missing information in the Arabic chapter" "uDear all; I am writing a small program to find similar triples in the ar.dbpedia.org graph. I successfully found many resources and extracted their URIs (or IRIs), but when I tried to open some of them (well, a lot of them) in the browser, I get this result: \"No further information is available. (The requested entity is unknown)\" No information is shown whatsoever. My question is: is this a problem with my code, or is the data about these resources not available for the Arabic chapter yet? It is worth mentioning that when I opened some of the URIs there is data. Examples of URIs: * uDear Ahmed, Try your code with resources like: and see whether it works or not. These resources are already mapped and you can make queries on the Arabic Chapter using the SPARQL endpoint. Best Regards, Dr.Haytham Al-Feel From: Ahmed Ktob < > Sent: Thursday, May 5, 2016 5:08:11 PM To: Subject: [Dbpedia-discussion] Missing information in the Arabic chapter uDear Haytham, Thank you so much for your quick reply. I have tried the URIs you gave, they are working fine, and as I have mentioned in my previous post, many of the URIs I discovered are working fine. And this is because the content from Wikipedia is currently mapped. The issue here is that some of the URIs I gave as examples (for instance: Wikipedia [1], but the has no content in ar.dbpedia.org. I wonder why? Maybe when extracting, some of the templates were left off. Best regards, Ahmed. [1] On 7 May 2016 at 02:14, Dr.Haytham Al-Feel < > wrote:" "DBpedia Project Update and Freebase Interlinking" "uHi all, a quick update on what is happening around DBpedia along the lines of: 1. Freebase 2. Data Quality 3. Live Data There was great news from Freebase at ISWC. They are now providing a Linked Data interface which makes the complete content of Freebase accessible to the Semantic Web.
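Once the DBpedia-to-Freebase link set described below is loaded, looking up the Freebase counterpart of a DBpedia resource should only take a few lines - roughly along these lines (a sketch using the SPARQLWrapper package, and assuming the links are published as owl:sameAs triples):

# Sketch: find Freebase counterparts of a DBpedia resource via the link set.
# Assumes the SPARQLWrapper package and that the links are loaded as owl:sameAs.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?same WHERE {
      <http://dbpedia.org/resource/Berlin> owl:sameAs ?same .
      FILTER (regex(str(?same), "freebase"))
    }""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["same"]["value"])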
This is especially exciting for DBpedia as both datasets have a large overlap, and having RDF access to Freebase makes it easy to mash up and fuse both datasets. We are currently in the process of generating links from DBpedia to Freebase for all 2.49 million things in DBpedia. These links will go online sometime next week and will immediately allow mashing Freebase and DBpedia data, for instance using tools like the Marbles Linked Data browser (which does owl:sameAs smushing). The links could also be the foundation for further work on fusing Freebase and DBpedia data, which I think will be very exciting and might show that the Semantic Web itself is developing into the world's database, fuelled by various valuable sources. There is also good news concerning DBpedia's two main problems: low data quality and stale data. Georgi, Anja and Paul are getting close to publishing a new cleaned-up DBpedia dataset based on the current Wikipedia dump. This extraction uses a new framework based on manual mappings of hundreds of Wikipedia templates to a clean ontology and improved datatype extraction algorithms. The new dataset is supposed to be released next week and should be clean enough to allow RDFS subsumption reasoning as well as to use it within faceted browsing UIs. There is also great progress towards getting the DBpedia dataset current and synchronising it with Wikipedia changes: Sören managed to convince the Wikipedia foundation to give us access to the Wikipedia live update stream, which tracks all changes in Wikipedia itself. Thanks a lot to the foundation for this! This is exactly what we needed. Based on this update stream we can sync DBpedia and Wikipedia, which will mean about 20 000 updates to the DBpedia dataset per day. Orri from OpenLink said that this is no problem for the Virtuoso server which is used to host the DBpedia SPARQL endpoint and Linked Data interface. Thus, after the new dataset is released, we will look into extending the extraction framework for continuous updates and are looking forward to being able to serve a live version of DBpedia soon. Cheers Chris" "DBPedia and thumbnail images" "uHello all, When interacting with the DBPedia sparql endpoint, I usually have a need for loading a thumbnail of a topic in addition to the other information about it. For example, Films often have a thumbnail image associated with them, but so far I haven't found any thumbnails of films that actually return a working image. I.e. The Hobbit has a broken image associated with it: Is there any plan to fix this behavior? Also, what are the challenges associated with pulling data from wikipedia that may be time sensitive (in this case, images that were once there, that are no longer there)? Regards, Kristian Alexander PS: When I first encountered this issue, I asked a question about it on stack overflow, and there isn't a good solution to it, so if anyone does have a solution, feel free to post it there. dbpedia-owlthumbnail-image-paths-are-broken" "Downtime estimate available?" "uHi, Just wondering if there is an estimate on how long will be down for. I am not sure when it first went down but just wondering if there is maintenance or something going on for a known amount of time. Cheers, Peter Ansell uHello Peter, Peter Ansell schrieb: The wiki itself is hosted on one of our servers in Leipzig. You can reach it via should be a redirect to the Wiki.) The linked data interface has undergone some changes last week, which may be the reason why it is down now.
We hope OpenLink will fix this issue soon. Kind regards, Jens uThanks, I was referring to the sparql endpoint and responder for Just wondering what the situation was :) Cheers, Peter 2008/7/21 Jens Lehmann < >: uHi Peter, The Dbpedia services are back online Best Regards Hugh Williams Professional Services OpenLink Software On 21 Jul 2008, at 08:22, Peter Ansell wrote:" "How DBpedia Extraction Framework in Windows 7" "uDear Adrian, I found a discussion between you and Mr.Dimitris on this link: for installing DBpedia extraction framework in windows. I spent very long time trying to run it on windows, but unfortunately I couldn't run it :( . So, can you please explain to me how can I run DBpedia Extraction Framework in Windows 7 64bit? Many thanks in advance! With Regards,Abdullah u uDear Adrian, Many thanks for your help, I really appreciate it. I will try to follow your instructions in order to run DBpedia Extraction Framework my machine and I will let if I have any problem. Many thanks. With Regards,Abdullah Date: Thu, 25 Jul 2013 12:58:36 -0700 From: To: CC: ; ; Subject: Re: [Dbpedia-discussion] How DBpedia Extraction Framework in Windows 7 Dear Abdullah, I answered you on your private mail. Best regards,Adrian On Thursday, July 25, 2013 9:06:52 PM UTC+2, Abdullah Nasser wrote: Dear Adrian, I found a discussion between you and Mr.Dimitris on this link: for installing DBpedia extraction framework in windows. I spent very long time trying to run it on windows, but unfortunately I couldn't run it :( . So, can you please explain to me how can I run DBpedia Extraction Framework in Windows 7 64bit? Many thanks in advance! With Regards,Abdullah uDear Adrian, While I'm trying to run DBpedia Extraction Framework in my machine (Windows 7 64bit) and when I reach to this step: \"Once everything is compiled, in the same maven's window you have to select “DBpedia dump extraction” ->Plugins -> scala ->scala:run\" , which is in the third section of installing DBpedia Extraction Framework , I got this error: \"C:\Program Files\Java\jdk1.7.0_07\bin\java\" -Dmaven.home=C:\maven -Dclassworlds.conf=C:\maven\bin\m2.conf -Didea.launcher.port=7533 \"-Didea.launcher.bin.path=C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 12.1.4\bin\" -Dfile.encoding=UTF-8 -classpath \"C:\maven\boot\plexus-classworlds-2.4.jar;C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 12.1.4\lib\idea_rt.jar\" com.intellij.rt.execution.application.AppMain org.codehaus.classworlds.Launcher uDear Abdullah, I only used download and extract goalsAs far as I understand Import is only needed for importing current wikipedia dumps into mediawiki for extracting abstracts. I have never used this goal on Windows. I don't need the abstracts currently so I never bothered with Import. My goal was just to create newer dumps without much headache and see how easy/hard it is to create new custom mappings using Scala. However please be aware that \home\release\wikipedia\wikipedias.csv - is a Linux path, not a Windows (C:\). As I said all paths need to be Windows paths if you want it to work. So please have a look at the paths from this file: (essentially the POM file for the dump sub-project)and notice that basically all the paths are Linux paths instead of the classic Windows paths (like C:\dumps\) If you want to extract the abstracts you also need to install MediaWiki on Windows and fix all those paths as far as I can tell. 
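If it helps, after editing the POM I sanity-checked it with a tiny throwaway Python script (my own helper, not part of the framework) that simply flags any remaining Linux-style paths:

# Throwaway helper: list Linux-style paths that are still left in the dump module's
# pom.xml after editing it for Windows. File name and heuristic are mine; adjust as needed.
import re

with open("dump/pom.xml") as f:
    for lineno, line in enumerate(f, 1):
        for path in re.findall(r'/home/[^<"\s]+', line):
            print("line %d still contains a Linux path: %s" % (lineno, path))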
Best regards, Adrian On Sat, Jul 27, 2013 at 5:39 PM, Abdullah Nasser < > wrote: uDear Adrian, Many thanks for your help. Can you please tell me what \"\home\release\wikipedia\" should change to in Windows? Because I don't know exactly what I should change this path to on Windows. For example I tried this: C:/Users/ANM/extraction-framework/extraction_framework/.hg/extraction_framework/dump but I'm not sure whether it is correct or not. With Regards, Abdullah uHello Abdullah, This path: \"\home\release\wikipedia\" actually means the \"home\" directory of the \"release\" user on Linux. On Linux/Unix/MacOS they do not have the typical C:/, D:/, E:/ drives, but rather a file system that mounts the folder directly, so you need to have an easy way to find your folders (therefore \home\release\ would be just fine). You can as well put C:/wikipedia/ on Windows for that. However, I really have no idea what folder you need to use for MediaWiki under Windows.
As I told you I was not interested in the abstracts Also somehow this path C:/Users/ANM/extraction-framework/extraction_ framework/.hg/extraction_framework/dump doesn't seems quite rightIt's just a feelingbut you should not have any .hg folders (related to old Mercurial dependencies I guess) At least as far as I remember I had no similar pathhmmI'm a bit lost here All my paths were clean C:/extraction-framework/dump directly Best regards, Adrian On Sat, Jul 27, 2013 at 10:21 PM, Abdullah Nasser < >wrote:" "Unexpected answer to simple query" "uHi All, I put to query: select ?x where {?x skos:broader ?x.} the second line in an answer is but in Wikipedia itself page does not have itself as a subcategory. What is wrong here? Alex Hi All, I put to Alex" "Help on forming this query (getting color names)" "uHi everyone, I have a question with regards to forming a SPARQL query for DBPedia. I'm new to SPARQL and some help here will really be appreciated. What I'm trying to do is to pull off the list of colors, as well as their corresponding hex values off of this resource: My test SPARQL query to grab all of the content in there looks like this: SELECT ?property ?hasValue WHERE { ?property ?hasValue } And it looks good; I'm getting color names nicely split into separate elements: The problem is, the corresponding hex values are dumped further down in separate rows: http://dbpedia.org/property/hex 002FA7 This list of colors are ordered by their hex value, and so do not match the same order of the list of color names. So my question is this: how will I be able to grab the matching hex values for each color? Thanks and regards, Andrew uHi Andrew, First, where List_of_colors doesn't seems to be on the dbpedia endpoint ( So I had to tweak it a bit (dunno if all colors are there): SELECT ?colorName ?hexValue from WHERE { ?color . ?color ?hexValue. ?color ?colorName. } What I hope is that all colors have the property \"rgbspace\" with \" \" has object. From there you get hex-values with names (titles). The top 25 results are: * Names are weird though Hope this helps. Take care, Fred * uThanks for the quick reply Frederick, Hmm, List_of_colors does exist, and I got the basic results via this query on dbpedia's endpoint: SELECT * WHERE { ?p ?o} Please try this link: Can you check and see if that works? This list has a lot more colors and corresponding hex values and is the reason why I'm querying it. I just don't understand how I'll get the separated names and hex values to come together. Thanks also for the example you gave; that's a great way to get the names but due to the inconsistencies of the wikipedia content there's not too many colors that match the query. However I'll see how I can work with your example on the above resource. If anyone has a solution to the same question, i.e. getting names+hex values from the List_of_colors resource, I will be eternally grateful! Cheers, Andrew On May 6, 2008, at 11:10 AM, Frederick Giasson wrote: uHi Andrew, Ooups, yeah, sorry. So, there is what I propose: SELECT ?colorName ?hexValue WHERE { ?o. { ?o ?colorName. ?o ?hexValue. } filter (lang(?colorName) = \"en\"). } The result is here: I agree that it looks like much better :) Take care, Fred *" "No abstract Field" "uDear DBpedia developers, I'm struggling with following SPARQL-Query. select distinct ?abstract where { < dbpedia-owl:abstract ?abstract } I'm trying to get the value of dbpedia-owl:abstract, but there are no results. On the corresponding dbpedia-page is text in the dbpedia-owl:abstract Field. 
Can somebody give me a hint, what I'm doing wrong? greetings Sebastian Dear DBpedia developers, I'm struggling with following SPARQL-Query. select distinct ?abstract where { < Sebastian uHi Sebastian, there's two issues here: 1) This query works only on the de.dbpedia.org/sparql endpoint, but not on dbpedia.org/sparql; I don't know which one you used. 2) For SPARQL, you have to use encoded URIs, so you have to rephrase your query to select distinct ?abstract where { dbpedia-owl:abstract ?abstract } These two tricks should do. Best, Heiko Am 22.07.2014 11:35, schrieb Sebastian Schwarz: uHi Sebastian, Thanks for reporting this issue, it seems we have some encoding issues in the Sparql endpoints. We should have it fixed in the next weeks when we update the datasets in the endpoints. Cheers, Alexandru On Mon, Jul 28, 2014 at 11:47 AM, Heiko Paulheim < > wrote:" "Links to Geonames" "uI'm a little perplexed re links to geonames: I downloaded see it has only 86,547 records; this is a very small subset of Wikipedia. I joined Geonames and Wikipedia beaches on name and lat/long and got + 30 matches. Why would these not be included? Am I missing something? Thanks for any help! Carlo I'm a little perplexed re links to geonames: Carlo uHello, Carlo Brooks wrote: For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years). The best way to improve the situation, in case up-to-date links are important for you, is to add a script (or a SILK file etc.), which can efficiently compute the links between the two data sets to the SVN repository, such that it can be run regularly. Kind regards, Jens uI see. Is there a way to access the svn directly as opposed to downloading I am not certain there is a better file, but if there is I would just like to know where I can access it Thanks much, Carlo On Mon, Apr 19, 2010 at 10:45 PM, Jens Lehmann < > wrote: uHello, Carlo Brooks wrote: SVN can be accessed as follows: However, the SVN contains the extraction framework (and not the data sets generated by it), so you won't find another Geonames link file there. Kind regards, Jens uOn Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann < > wrote: Is there a list someplace of who is responsible for each of these link sets and when they were last updated? I think I remember reading somewhere that the Freebase links were in a similar situation. Also, the last time the links were done they were made to the GUID form of the Freebase identifier, which I'm not sure is the best target (conversely, Freebase generates DBpedia for *every* Wikipedia article name, including redirects and misspellings, which doesn't seem right either). Tom uHello, Tom Morris wrote: If you go to the download page and click on a data set, you get some information (or scroll to the bottom of the page): To see whether data sets have changed compared to previous releases, you can go to Please note that within the last year the extraction framework was rewritten and the live extraction was implemented. It's difficult to improve all aspects of DBpedia within a short timeframe and most interlinking data sets were never designed for long term maintenance, but rather one time efforts. (Anyone is invited to contribute mapping code to DBpedia, of course, to improve the situation.) 
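Such a contribution does not need to be elaborate - even a small script that joins the two datasets on a normalised name plus coordinate proximity and writes out owl:sameAs triples would already be a big step. A pure sketch (input files, field layout and the distance threshold are invented for illustration):

# Pure sketch of a DBpedia <-> Geonames linker: join on lower-cased name and
# coordinate proximity, emit owl:sameAs triples. Input formats and the 1 km
# threshold are made up for illustration only.
import csv, math

def close(lat1, lon1, lat2, lon2, km=1.0):
    # crude equirectangular distance, good enough for a first matching pass
    dx = (lon1 - lon2) * math.cos(math.radians((lat1 + lat2) / 2)) * 111.32
    dy = (lat1 - lat2) * 110.57
    return math.hypot(dx, dy) <= km

geonames = {}  # name -> list of (geonames id, lat, lon)
with open("geonames_places.csv") as f:
    for gid, name, lat, lon in csv.reader(f):
        geonames.setdefault(name.lower(), []).append((gid, float(lat), float(lon)))

with open("dbpedia_places.csv") as f, open("geonames_links.nt", "w") as out:
    for uri, name, lat, lon in csv.reader(f):
        for gid, glat, glon in geonames.get(name.lower(), []):
            if close(float(lat), float(lon), glat, glon):
                out.write("<%s> <http://www.w3.org/2002/07/owl#sameAs> "
                          "<http://sws.geonames.org/%s/> .\n" % (uri, gid))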
Kind regards, Jens uOn Tue, Apr 20, 2010 at 3:22 PM, Jens Lehmann < > wrote: Thanks. I'd seen that. I was hoping for something more along the lines of an email address or a person's name. The entries in question are: \"Links to Freebase - Links between DBpedia and Freebase. Update mechanism: unclear/copy over from previous release. \"Links to Geonames - Links between geographic places in DBpedia and data about them in the Geonames database. Provided by the Geonames people. Update mechanism: unclear/copy over from previous release.\" I'm willing to help out with that, but it would seem like the people who did the original mappings are likely to have knowledge, and perhaps even code, from their previous efforts which would be highly applicable to the task. Is all that knowledge really lost forever? Tom uTom Morris wrote:" "Is this the right place to ask about weirdness with the faceted search interface for dbpedia?" "uApologies if this is not the right venue for this, pointers much appreciated if this is true. Also, I'm rather sorry if you get this twice as my email addresses and mailing lists often have issues. I am using Safari 5.0.4 (5533.20.27) on OS-X 10.5 and 5.0.4 (6533.19.4) on OS-X 10.6 and am having problems with the faceted search: a) at bunch of results b) I go to the \"Types\" view, following the link in the top-right, then select \"Distinct Views (Aggregated)\" and odd things happen. Initially it was inserting additional constraints that I had previously added and then removed (e.g. a restriction to owl:Thing), but when I just tried it it swapped out my search term of \"M31\" for \"hello world\". I'm not sure if the URLs are of any use, but for me has the selection Entity1 has any Attribute with Value \"M31\" and if I then select \"Distinct values (Aggregated)\" I get taken to which now says Entity1 has any Attribute with Value \"hello world\" It gets worse: re-starting Safari and entering a new search term of \"Messier Galaxy\" takes me to which has the following two constraints: Entity1 has any Attribute with Value \"Messier Galaxy\" Drop. Entity1 = . Drop Whilst I do appreciate a good game of cricket like the next person, I'm currently more interested in galaxies than batsmen! Thanks from a very-confused Doug
uOn 3/17/11 12:55 PM, Douglas Burke wrote: Here it's applying a DISTINCT based Aggregate. 1. 2. > for me has the selection See: > > UI is confusing for a plethora of reasons :-( The sequence is supposed to be as follows: Text Search Tab: 1. Enter a Text Pattern 2. Filter results by Entity Type or other Entity Attributes 3. Keep filtering until you're happy across Type or Attributes dimensions 4. Once happy, click on Entity1 or EntityN (depending on how many Entities your interaction brings into scope re. query builder area at the top, which also includes a SPARQL link etc.) to see Entities that match your quest. Anyway, state what you seek, and I'll try to make a permalink or read:" "Throttled requests" "uHi, I am currently seeing 5s response times from dbpedia from my website's server, with 60ms from my home PC for the same requests. It looks to me like my server is being throttled. Which is possible, as I was running some crawling scripts to warm my caches - rather ironically, so I wouldn't need to keep hitting dbpedia. Is there a way to get my server unthrottled whilst apologising profusely? Thanks, Alan Patterson" "Problems with config.properties and mapping definitions" "uHi everybody, I tried to set up an extraction framework instance. Unfortunately some undocumented problems occurred: 1. When I try to use my own wiki-dump by defining it in config.properties this option seems to be ignored. Even if I set dumpDir=/totalnonsense the framework does not change its behaviour. Additionally I define updateDumps=false. => How do I recognize if my specific dump is used instead of the one from dbpedia.org? 2. What (and when) exactly will be stored in the output directory defined at config.properties? So far nothing has been stored there, yet. 3. Is it possible to make an own instance of the framework work without using dbpedia.org? I would like to test some own mappings before uploading them to the dbpedia database. Do I need to run an own MediaWiki Server to accomplish this goal or is there any easier way (e.g. by using plain text mapping files and the wiki and ontology dumps)? Thanks for your help, Bastian uHi, 1. When I try to use my own wiki-dump by defining it in config.properties this option seems to be ignored. Even if I set dumpDir=/totalnonsense the framework does not change its behaviour. Additionally I define updateDumps=false. => How do I recognize if my specific dump is used instead of the one from dbpedia.org?
the DBpedia dump downloads the latest wiki-dumps from wikipedia and stores it in a specific folder structure under 'dumpDir'. if you want to use your dump you must place it under the proper folder. you could let it download the default and then replace it with your. (remember to enable only the languages you want, otherwise you will download all language wikipedia dumps)  2. What (and when) exactly will be stored in the output directory defined at config.properties? So far nothing has been stored there, yet. stores the extracted triples (according to the extractors you enabled) after you successfully run the dump  3. Is it possible to make an own instance of the framework work without using dbpedia.org ? I would like to test some own mappings before uploading  them to the dbpedia database. Do I need to run an own MediaWiki Server to accomplish this goal  or is there any easier way (e.g. by using plain text mapping files and the wiki and ontology dumps)? i am not sure, but i don't think this is possible Cheers, Dimitris" "relationship between properties of the same object" "uHi, I would like to know i can i understand if string values of two properties of the same object are related. For example, if i get information about a Film, i can list all quotes of the film (quote), and i can list who made a quote for that film. Among the results, for example, i can find that \"Time Magzine\" quoted \"Final Fantasy: The Spirits Within\". Now, how can i find out which quote has been made by \"Time Magazine\"? thanks in advance, Tanya" "WikiCompany alternatives?" "uHi, points to WikiCompany: which is a bad link, HTTP 404. More generally, it appears that WikiCompany had vanished. Are there other datasets of company names? Orgpedia sounds interesting, but nothing to use yet, AFAICT. Is there anything with company data from OpenStreetMap? Or anything with Dun and Bradstreet numbers for US companies? Thanks, Lee" "Let's say goodbye to skos:subject" "uand say welcome to something else? :-) skos:subject doesn't exist anymore in the SKOS Vocabulary [1]. It has been removed since longtime (2005) [2]. They suggest to turn to other vocabularies, e.g., dct:subject [3, 4, 5]. Why don't we listen to this suggestion, and modify the extraction framework (it's just one line of code) so that the next DBpedia release will have the right subject relation to link a resource to a category? If you think that we absolutely need the old property, we could provide triples both with skos:subject and with dct:subject. Existing applications would continue to work and new applications could use the new prop. [1] [2] [3] [4] [5] cheers. uHello Roberto +1 for this. This anomaly has been pointed at again and again. If DBpedia wants to keep up being the showroom of linked data good (best?) practices, it should at least abide by standard vocabularies. Switching from skos:subject to dcterms:subject does not seem so difficult a task, for DBpedia as well as for applications consuming the data. Cheers Bernard 2010/12/10 Roberto Mirizzi < > uConsider it done for version 3.6. 
:) There will also be other name changes in the next version (this list will be repeated in the release notes): - foaf:givenName replaces foaf:givenname (also deprecated) - - - - - http://dbpedia.org/ontology/wikiPageID replaces http://dbpedia.org/property/pageId - http://dbpedia.org/ontology/wikiPageRevisionID replaces http://dbpedia.org/property/revisionId - http://dbpedia.org/ontology/wikiPageWikiLink replaces http://dbpedia.org/property/wikilink - http://dbpedia.org/ontology/wikiPageExternalLink replaces http://dbpedia.org/property/reference - http://dbpedia.org/ontology/wikiPageRedirects replaces http://dbpedia.org/property/redirect - http://dbpedia.org/ontology/wikiPageDisambiguates replaces http://dbpedia.org/property/disambiguates The /ontology/ namespace contains the high quality data of DBpedia and should always be preferred over the /property/ namespace. The name changes are in the spirit of moving all predicates that come from high quality extractors to the cleaner /ontology/ namespace. Cheers, Max On Fri, Dec 10, 2010 at 5:50 PM, Bernard Vatant < > wrote: uOn 16/12/2010 17:28, Max Jakob wrote: GREAT! :-D" "Problem with building extraction-framework on local Ubuntu 12.04LTS" "uHi, I am trying to compile the extraction-framework on my local server. I followed the instructions on site [1]; however, when I tried to run \"mvn clean install\", something went wrong and the error message shows as below: [INFO] Scanning for projects [INFO] uHi, This seems strange. Can you delete your ~/.m2 contents and try again? Best, Dimitris On Mon, May 27, 2013 at 7:17 PM, 126 < > wrote: uHi Dimitris, Thank you for your response, I figured it out myself: the Ubuntu eCryptfs has something to do with this problem, because I encrypted my home folder and this brings in a limitation on directory path length. When I moved this project out of my home folder and built it, it all worked. By the way, I have some problems with the download-config-file and the extraction-config-file, where could I find their specifications? Thank you! Best regards, cusion uOn Fri, May 31, 2013 at 3:09 PM, 126 < > wrote: Inside the dump folder you may find many variations, but you can also adjust them to your needs. The files also contain comments for every option, but you can also ask if we missed something. Dimitris" "Missing mappings by frequency" "uHi. I was thinking that infoboxes probably follow Zipf's law, and made a set of pages for mappings by frequency on my user page: I think the way I did it should only take into account the templates that were missing mappings at the time of the last extraction - I put the quick and dirty script I used on the page too. I used 50 occurrences as the cutoff point, to not have too much noise, and there's no filtering to ensure that the templates are infoboxes, but it should give a rough guide to which templates are the most important to get the maximum amount of data." "HTML abstracts from the sparql endpoint?"
"uHello, is it possible to obtain article abstracts in html from the sparql endpoint? This query for the \"School\" article abstract, for instance: SELECT * WHERE { ?hasValue . FILTER ( lang(?hasValue) = \"en\" ) } from which I produce this url: gives the abstract in plain text (within the td tag). Is it possible to receive the html, or is there another service that might be better suited to this, perhaps reading the abstracts directly from Wikipedia? Thanks, Ollie" "RelFinder - Version 1.0 released" "uSteffen Lohmann wrote: Steffen, Very cool! Please add: endpoints. The effect of doing this ups the implications of this tool exponentially! Try it yourself e.g. Google to Apple (use the DBpedia URIs). The density of the graph, the response time provide quite an experience to the user (especially a Linked Data neophyte). Notes about URIBurner: 1. Quad Store is populated progressively with contributions by anyone that uses the service to seek structured descriptions of an HTTP accessible resource via ODE bookmarklets, browser extensions, or SPARQL FROM Clause that references external Data Sources 2. As part of the graph construction process it not only performs RDF model transformation (re. non RDF data sources); it also performs LOD cloud lookups and joins, ditto across 30+ Web 2.0 APIs 3. Anyone with a Virtuoso installation that enables the RDF Mapper VAD (which is how the Sponger Middleware is packaged) ends up with their own personal or service specific variant of URIBurner. Again, great stuff, this tool is going to simplify the message, a lot re., Linked Data and its penchant for serendipitous discovery of relevant things. uHi all, we are happy to announce the release of version 1.0 of the RelFinder. The RelFinder is a tool that extracts and visualizes relationships between given objects in Semantic Web datasets and makes these relationships interactively explorable. It advances the idea of the DBpedia Relationship Finder by offering improved visualization and exploration techniques and working with any dataset that provides SPARQL access. Some key features are: - relationships even between more than two given objects (all visualized in one 'relationship graph') - easy configurability of the accessed dataset and search parameters (via settings menu or config file) - aggregations and global filters (based on relationship length, class, link type, connectivity) - highly interactive visualization (highlighting, red thread, pick&pin;, details on demand, animations) The RelFinder is implemented in Adobe Flex and requires only a Webbrowser with installed Flash Player. Give it a try at have not been aware of before ;-) Thanks go to Jens Lehmann, Jürgen Ziegler, Lena Tetzlaff, Laurent Alquier & Sebastian Hellmann. Best regards, Philipp, Timo & Steffen uThanks Kingsley, I quickly added URIBurner as a dataset but cannot see the added value w.r.t. the RelFinder - your Google-Apple example produces mainly \"seeAlso\" links, which are not that helpful to discover new relationships. Here is the link with the parameters: Once again a demonstration that variety of link types is a big challenge when automatically generating RDF data (link variety is indeed important for the RelFinder). Anyways - thanks for the pointer. We are generally interest in integrating further datasets to the RelFinder's default installation (as long as they produce valuable relationships). Further ideas and links are welcome! Steffen uSteffen Lohmann wrote: There is little more to whats going on with URIBurner. 
Lets say your understood specific relationships exposed via seeAlso, you could knock a simple RDF document and the have URIBurner consume it (or just SPARQL the resource from its /sparql), then go back to Relfinder to see the effect. There is something that still isn't quite clear about what I am trying to demonstrate re. progressive construction of Linked Data Spaces :-) Static Data is a commodity with diminishing value. We have to be able to move from generality to specificity, in an unobtrusive way re. any Web of Linked Data :-) Kingsley" "DBpedia Lookup downtime - 6th August" "uDear all, There will be a downtime due to maintenance of whole networking system at Free University Berlin tomorrow (6th August) starting at 7am GMT for around 6 hours. This will also affect the DBpedia Lookup Service ( I'll try to re-route the service for that time to another server, but I can't promise this is going to work. I'm very sorry for the inconvenience. Please let me know if this downtime causes a serious problem to a service of you, so that we can find another solution. Best regards, Georgi" "Query not working any more on DBpedia Live" "uHello, The following query is expected to return the abstracts of the articles on Victor Hugo in all languages: select ?abstract where { dbpedia-owl:abstract ?abstract } This query used to work fine on but since October 1st it's started returning an empty result. However it's still working fine on the non-live endpoint ( I know there has been a new version of DBpedia released in September but I couldn't find any information that would explain the problem. Am I missing something? Thanks, uHello, It seems an abstract anymore. I guess the extraction from Romain Beaumont 2014-10-03 10:23 GMT+02:00 Nitsan Seniak < >: uThanks, good point. So I tried with another resource, where the abstract seems to have been extracted this time: select ?abstract where { dbpedia-owl:abstract ?abstract } This time I do get the English abstract as a result, but the language indication seems incorrect. With dbpedia.org, the result abstract string is suffixed with \"@en\", and with suffixed with with \"^^ \". Maybe there's a general problems with abstracts? Thanks, uAll, There is currently a problem with the extraction code that stripped the language from abstracts and a couple of similar tags. We have taken the extraction stream offline while we make a bugfix in the extraction code and perform a hotfix to our database to restore the language tag on these strings. I will advice as soon as these procedures are completed and we can continue the live stream. Patrick" "Empty RDF files" "uHello, correctly and just show \"# Empty TURTLE\". An example of this can be seen in the entry Can this be fixed somehow? Thanks for your help. 
Artur uHi Artur, This issue has been fixed so you can now get the RDF files and other output serialisations of the data Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 5 Apr 2011, at 14:37, Artur Soler wrote:" "Announcing OpenLink Virtuoso Open-Source Edition v6.0.0" "uHi, OpenLink Software is pleased to announce the official release of Virtuoso Open-Source Edition, Version 6.0.0: New product features include: * ANY ORDER Queries * Anytime Queries (basic and complex business intelligence style analytics queries) * Client-level resource accounting * Expressions in \"IN\" predicate * Faceted Data Exploration Engine & Web Services (REST or SOAP) for high-performance disambiguated entity search & find, across entity type and property dimensions * Inverse Functional Property Value enhanced Identity * Key compression * Transitive subqueries in SQL, SPARQL, and SPASQL (ODBC, JDBC, ADO.NET, OLEDB, XMLA) * Enhanced Sponger Middleware Layer * DBMS hosted Public Key Infrastructure for FOAF+SSL based Federated Identity. Related Links: Virtuoso Open Source Edition: * Home Page: * Download Page: Amazon EC2: * Installation and configuration page * How use pre-configured and pre-loaded Virtuoso instances via publicly shared Elastic Block Storage Devices (e.g. DBpedia, BBC Music & Programmes, etc.) OpenLink Data Spaces: * Home Page: * SPARQL Usage Examples (re. SIOC, FOAF, AtomOWL, SKOS): URIBurner Service: * Home Page: Linked Open Data Cloud Cache: * Home Page: OpenLink Data Explorer (Extensions & Bookmarklets for Browsing & Exploring Linked Data): * Home Page: OpenLink AJAX Toolkit (OAT): * Project Page: * Live Demonstration: http://demo.openlinksw.com/oatdemo * Interactive SPARQL Demo: http://demo.openlinksw.com/isparql/ Faceted Browser: * http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtuosoFacetsWebService Regards, ~Tim" "which folders to index to get full English coverage?" "uHi, In order to get the same triple coverage in English, as the the public SPARQL service: core-i18n/ 30-Mar-2016 12:19 - ext/ 27-Feb-2016 19:31 - fused/ 27-Feb-2016 19:32 - links/ 08-Mar-2016 12:51 - statistics/ 11-Mar-2016 16:40 - 2015-10_dataid_catalog.json 15-Mar-2016 10:51 28K 2015-10_dataid_catalog.ttl 15-Mar-2016 10:51 22K dbpedia_2015-10.nt 23-Feb-2016 16:17 4M dbpedia_2015-10.owl 23-Nov-2015 12:56 2M dbpedia_2015-10.xml 23-Nov-2015 12:56 1M dbtax.csv 14-Mar-2016 17:43 146 lhd.csv 14-Mar-2016 17:42 417 lines-bytes-packed.csv 14-Mar-2016 17:43 654K Hi, In order to get the same triple coverage in English, as the the public SPARQL service: 654K uHi Joakim, the online SPARQL endpoint + the DBpedia ontology dbpedia_2015-10.nt On Thu, Apr 7, 2016 at 3:40 AM, Joakim Soderberg < > wrote: uI am not sure, for example links are missing holding relations to wordnet for example uThe wordnet links are still in the links & core folders like the previous release (and hopefully we will get (a lot) more wordnet links in the next release by Jimmy O'Regan) Best, Dimitris On Thu, Apr 14, 2016 at 4:00 AM, Joakim Soderberg < > wrote: uWhat is the main expected contribution regarding the links to Wordnet? I am planning to use these links to enrich the interface of our Portuguese Wordnet ( Best," "Dbpedia live updates has shrunk" "uDear All, We collect additional entries from dbpedia live from time to time. I noticed the number of entries has shrunk. For example for Actor , we used to get about 40K actors from live updates ( now it is just down to 6500. 
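(For reference, I am counting them with a query more or less like the one below, via the SPARQLWrapper package, so the drop is not an artifact of my own parsing; the endpoint URL is just the one I happen to use.)

# Roughly how the actor count is obtained (sketch; SPARQLWrapper package assumed,
# endpoint URL is the one I use and may differ from yours).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://live.dbpedia.org/sparql")
sparql.setQuery("""
    SELECT (COUNT(DISTINCT ?actor) AS ?n) WHERE {
      ?actor a <http://dbpedia.org/ontology/Actor> .
    }""")
sparql.setReturnFormat(JSON)
print(sparql.query().convert()["results"]["bindings"][0]["n"]["value"])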
Another issue regarding entries We used to get < from live Updates. It does not exist anymore, why? Also, why were previous liveUpdate results not added to the 2016 release? Thank you very much uThanks a lot Magnus. Have a great weekend On Wed, Mar 22, 2017 at 12:09 PM, Hamid Ghofrani < > wrote: uDear Hamid, What do you mean by \"entries\", rdf:type relationships? This might be because Wikipedia editors tend to use more generic templates, e.g. Person instead of Actor. DBpedia uses these templates to map the resource to a class from the DBpedia ontology. This is truly an issue: a lot of interesting stuff is removed from infoboxes in a concerted way, e.g. notable works, influenced persons, and more. It would be interesting to check whether (and which) relevant information from past versions could be perpetuated to be continuously part of current releases. I cannot say when this was the case. DBpedia Live and the releases are not published together and results are not mixed. DBpedia Live extracts the latest state from Wikipedia as soon as the article changes, so there may be small date/time differences between the extraction times of individual articles. The releases, on the other hand, are all based on a single dump from Wikipedia, i.e. all having the same extraction time. Furthermore, the release cycle allows us to do more complex extractions, e.g. internationalization, page links, and post-processing steps such as type inference, page rank, etc. These are not feasible with the constantly changing nature of DBpedia Live. Best regards Magnus [1] index.php?title=Tom_Cruise&action=edit&oldid=530693566" "dbpedia-live produces unreadable JSON?" "u[ ~]$ python -c \"import jsonlib2; from urllib2 import urlopen; print jsonlib2.read(urlopen(' Traceback (most recent call last): File \" \", line 1, in jsonlib2.ReadError: JSON parsing error at line 23, column 363 (position 6010): Unexpected U+000A ( ) while looking for printable characters. Obolensky on dbpedia.org works just fine, which confuses me as to whether this is a bug against DBpedia or Virtuoso [ ~]$ python -c \"import jsonlib2; from urllib2 import urlopen; print jsonlib2.read(urlopen(' [u' u' u' u' u' Obolensky'] uJoe Presbrey wrote: uI've looked a little closer here and this now looks like a bug in Virtuoso's JSON encoder. The server is sending unescaped newlines in JSON values/objects, e.g. ^M instead of \"\n\". From: the only characters allowed unescaped in JSON are: %x20-21 / %x23-5B / %x5D-10FFFF Kingsley, is there a proper route for me to submit and follow this? I'm sure you have other things uHi Joe, We shall look into this issue and report back on this mailing list Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 14 Jun 2010, at 22:10, Joe Presbrey wrote: uJoe Presbrey wrote: Posting here or Virtuoso Open Source support forum is fine :-) Kingsley uAny news on this bug? Still triggers in 2 different JSON modules as of 1 minute ago: $ python -c \"import jsonlib2; from urllib2 import urlopen; print jsonlib2.read(urlopen(' Traceback (most recent call last): File \" \", line 1, in jsonlib2.ReadError: JSON parsing error at line 23, column 363 (position 6010): Unexpected U+000A ( ) while looking for printable characters.
$ python -c \"import simplejson; from urllib2 import urlopen; print simplejson.loads(urlopen(' Traceback (most recent call last): File \" \", line 1, in File \"/usr/lib64/python2.6/site-packages/simplejson/init.py\", line 307, in loads return _default_decoder.decode(s) File \"/usr/lib64/python2.6/site-packages/simplejson/decoder.py\", line 335, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File \"/usr/lib64/python2.6/site-packages/simplejson/decoder.py\", line 351, in raw_decode obj, end = self.scan_once(s, idx) ValueError: Invalid control character at: line 23 column 363 (char 6010)" "Collecting Metadata for PublicDomainWorks.net" "uHi all, I write to you on behalf of the Public Domain Working Group of the Open Knowledge Foundation. We are currently working on enhancing and expanding the collection of works available at www.publicdomainworks.net - a registry of artistic works that are in the public domain. In order to do that, we need to gain access to a maximum of information concerning works of different kinds (in particular, the author, the date of birth and death, the date of publication, etc). While we already have a proper database for bibliographic works, we are interested in getting metadata for as many different kinds of works as possible. We are currently trying to figure out where we can find metadata about works other than books or other written publications. In particular, I would like to know whether you do already have a collection of metadata for different kinds of works (images, photographs, paintings, sound recordings, video recordings, etc) and/or who would you suggest we contact in order to obtain the metadata we are looking for. Looking forward to your answer, Primavera Hi all, I write to you on behalf of the Public Domain Working Group of the Open Knowledge Foundation. We are currently working on enhancing and expanding the collection of works available at www.publicdomainworks.net - a registry of artistic works that are in the public domain. In order to do that, we need to gain access to a maximum of information concerning works of different kinds (in particular, the author, the date of birth and death, the date of publication, etc). While we already have a proper database for bibliographic works, we are interested in getting metadata for as many different kinds of works as possible. We are currently trying to figure out where we can find metadata about works other than books or other written publications. In particular, I would like to know whether you do already have a collection of metadata for different kinds of works (images, photographs, paintings, sound recordings, video recordings, etc) and/or who would you suggest we contact in order to obtain the metadata we are looking for. Looking forward to your answer, Primavera" "FW: Editor permissions for dbpedia mappings" "uHi Max,I've sent the email attached below several days ago to the mailing list, but got now answer yet. I know that all of you are doing the job voluntarily and have other things to do, but it is really demotivating, if you want to contribute, but cannot. Is there a chance to get editor permissions any time soon?RegardsRené From: To: Subject: Editor rights for dbpedia mappings Date: Tue, 26 Apr 2011 22:29:37 +0200 Dear dbpedia authors,I'm a full time professor at the University of applied sciences in Hof, Germany. 
As part of my research on unified information access, I'd like to add an ontology class \"SnookerPlayer\" as a subclass of Athlete and also add an Infobox mapping for the English Wikipedia. It's only a small number of individuals in the according categories, but I'd like to get an understanding of how it works. From my perspective, the best thing is to contribute in order to understand it from inside out.Since there is currently the snooker world championship taking place, I thought I'd start with the category advanceRené uHi Rene, you silently got editor permissions last week, so you should be ready to go to write mappings ;) Cheers, Max On Tue, May 3, 2011 at 10:43, René Peinl < > wrote:" "fixing up the sports classes in the DBpedia ontology" "uThere are about 100 sports-related classes in the DBpedia ontology. Given this number, one might think that quite a bit of attention has been paid to the sports domain in DBpedia, and that it is well organized. However, the opposite is true. There are at least 12 sports-related groupings of classes: Sport itself, SportsLeague, SportsTeam, SportsTeamMember, Athlete, Coach, SportsManager, CareerStation, SportsEvent, SportFacility, SportCompetitionResult, and SportsSeason. There is insufficient organization of these classes. For example, Coach and SportsManager should be related to each other somehow. Sport, SportsLeague, SportsTeam, Athlete, Coach, SportsManager, SportsEvent, SportFacility, and SportsSeason all have subclasses. However, these subclasses are different for each of these classes. For example, there is AutoRacingLeague, SpeedwayLeague, and FormulaOneRacing but only SpeedwayTeam and FormulaOneRacing. Even the generalization structures of these subclasses differs, with GridironFootballPlayer being a generalization of AmericanFootballPlayer and CanadianFootballPlayer, but no gridiron generalization of AmericanFootballLeague and CanadianFootballLeague, or of AmericanFootballTeam and CanadianFootballTeam, or of AmericanFootballCoach. There are also missing property values for many instances of these classes. For example, sports leagues do not have an associated sport. What should be done is a complete rewrite of the sports area. All sports-related classes should be placed in the correct place in the DBpedia ontology. Classes like AmericanFootballPlayer should be defined as the class of athletes who play american football. Mappings then do not have to be to these sorts of classes, but can just assert the general sports-related class and the appropriate property value. This requires the expressive power of OWL, however, and may be too radical of a change. There are other changes that should also be made, such as having a hierarchy of sports, but Wikipedia does not have data to source this hierarchy. These changes requires the use of OWL constructs that have not appeared in the DBpedia ontology to date, however, and this this may be too radical of a change. Perhaps all that can be done is to regularize the sports categories, and ensure that sports-related objects are connected to the correct sport. This would require harmonizing the mappings for the Infobox sports league and other mappings that generate sports-related classes so that they use the same rules for their eventual class and so that they include a link to the sport itself. Comments? Is it possible to do either of these for 3.10? peter" "Odp: probably incorrect mapping to schema.org from MusicalArtist" "uI have a somewhat mixed feeling regarding the organization of Wikipedia. 
It is true that if you use it as a human, you will find what you need in a reasonable time (usually using search box and following direct category links). But from the POV of a KB engineer, its organization is very far from being perfect. Let me just give you an example: administrative categories in Wikipedia. These are categories that should not be normally displayed to the end-user. You would expect that there is one, maybe two means of expressing that given category is administrative (e.g. a container category and a template). But there are at least several ways of stating that a given category is administrative - sometimes this information is only stated in the contents of the category. Another example - eponymous categories. You have Cat_main template, which is used to state that there is a corresponding article for a given category. You also have Main template, which has similar meaning. But in majority of the situations the link between the cat. and article is only provided as a link at the beginning of the category contents or as the first item on the list of category articles. So I have doubts regarding construction of a well-structured ontology that would emerge e.g. from the usage of infobox templates in Wikipedia. I am not saying that Cyc or Umbel are perfect (they are not), there are also duplications and unnecessary classes there. But Cyc has been constructed for more than 20 years and I believe that many of the problems we are discussing here were already discussed during Cyc creation. Let me just say that you have 3 concepts in Cyc that correspond to \"church\": * #$ChurchService * #$Church-Building * #$Church-LocalCongregation I am not saying that the mapping we will produce in May or June will provide the correct classes in all the cases (this is just not feasible), but I am saying that the necessary concepts are already present in Cyc (and Umbel). And if they are really missing we will provided them and attach to the rich structure of Cyc/Umbel. Kind regards, Aleksander uHi, the kind of ambiguity we are discussing is well known in lexical semantics: it is called “systematic polysemy”. In practice, certain terms are used to express several related meanings. Patterns emerge out of these phenomena, e.g. Place/Building/Community, which the Cyc concepts noticed by Aleksander are an occurrence of. Systematic polysemy was originally described by James Pustejovsky, and studied in detail within WordNet by Paul Buitelaar. Trying to resolve that ambiguity is a typical formal ontology task, but this task is sometimes in conflict with common sense, and especially with linguistic and informal data. Pat Hayes (as far as I remember) used to call the excess of distinction “ontological overcommitment”. In fact there are even cognitive experiments that prove the importance of “controlled” ambiguity for fast and efficient thinking [6]. As an ontology designer that spent quite a time trying to understand how and when it is useful to make distinctions, I would recommend to follow the crowdsourced value of Wikipedia: there is structure there, but often it is not what we would like to find as formal ontologists or logicians. And this is not a “problem\" in my view: we need to understand what is relevant, and to derive requirements from it. If there are structures in Wikipedia that produce exceptions in DBpedia, and derivatively in potential formal orderings of classes and properties, such exceptions should be studied as empirical phenomena, not as “dirtiness” to be corrected. 
From this perspective, the DBpedia ontology is simply not informed by either criteria: it does not try to be formally correct, but it does not try to follow empirical science rules either. That’s why I was recommending to look at properties and mappings first. As far as Kingsley’s recommendation to “triangulate” with other ontologies, that’s a good one, but it should be accompanied by understanding what is the actual data-oriented ontology of DBpedia-Wikipedia. Let me reference some examples of this empirical research conducted by my lab: [1][2][3], and there is more of course in the literature, e.g. [4][5]. Best Aldo [1] Nuzzolese A.G., Gangemi A., Presutti V. Encyclopedic Knowledge Patterns from Wikipedia Page Links. Proceedings of ISWC2011, the Ninth International Semantic Web Conference, Springer, 2011. [2] Aldo Gangemi, Andrea Giovanni Nuzzolese, Valentina Presutti, Francesco Draicchio, Alberto Musetti and Paolo Ciancarini. Automatic Typing of DBpedia entities. Proceedings of ISWC2012, the Tenth International Semantic Web Conference, LNCS, Springer, 2012. [3] Presutti V., Aroyo L., Adamou A., Schopman B., Gangemi A., Schreiber G. Extracting Core Knowledge from Linked Data. Proceedings of the Second Workshop on Consuming Linked Data, COLD2011, Workshop in conjunction with the 10th International Semantic Web Conference 2011 (ISWC 2011), [4] [5] Johanna Völker, Mathias Niepert. Statistical Schema Induction. In proceeding of: The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011 [6] Steven T. Piantadosi, Harry Tily, Edward Gibson. The communicative function of ambiguity in language. Cognition, 2012; 122 (3): 280 On Apr 16, 2014, at 2:26:51 PM , wrote: uHi Peter, I agree with Aleksander about having mixed feelings about the Wikipedia category structure (I also agree with the imperfection of Cyc and UMBEL :) ). Besides the category issues Aleksander notes, there is another source of issues with the Wikipedia structure: compound categories. By \"compound categories\" we mean categories that combine a main subject with additional attributes or characteristics. Some examples for English are shown under [1]. Excluding administrative categories in the English Wikipedia, about half of categories are of this compound type. They often have prepositions, certain articles, or certain heads in the titles. (I will say that some of the compound categories are being converted over time to list categories 'List of XXX', which is another challenge.) These kinds of category problems exhibit themselves in how Wikipedia is used. First, my understanding is that Wikipedia's own structured effort, Wikidata, has chosen not to use the Wikipedia categories or the DBpedia ontology for their organizing structure [2]. Second, I think it can be fairly argued that most users of Wikipedia use it for lookup of specific references and related relationships. I know of very few examples where Wikipedia is used for casual discovery or navigation, uses which would rely on the Wikipedia structure. That you are willing, Peter, to work to improve the DBpedia ontology is fantastic. We ourselves have just spent tens of hours doing a manual mapping of the 650 or so classes in the DBpedia ontology to Cyc and UMBEL. I hope to post a link to that shortly; it may aid in some of your own efforts. Despite my earlier cynicism, I do wish your effort well. 
We are convinced to our bones of the value of the Wikipedia content, unique in human history, and we also believe there is much, much latent and decipherable structure within it, including its infoboxes as DBpedia has shown. But we are also convinced that the Wikipedia structure, as it stands, is significantly flawed. We'd like to be able to infer and discover across the entire Wikipedia structure, as well as to get to valuable content for specific things. Thanks, Mike [1] [2] On 4/16/2014 6:26 AM, wrote:" "WikiParser is not working but explicit SimpleWikiParser" "uHi, I tried to use the wikiparser and had the following: With val parser = WikiParser.getInstance(\"simple\") and running it with sbt I was told that WikiParser could not be found. With val parser = new WikiParserWrapper(\"simple\") I had the error: NoSuchFunction for page.format. With val parser = new SimpleWikiParser() everything works fine. Any ideas what's wrong? thx, Karsten uHi Karsten, DBpedia 3.9 uses only SimpleWikiParser, but it is a stable branch. If you just want SimpleWikiParser it should be fine calling it directly. You can switch to the master branch (4.0-SNAPSHOT), where we (re)support WikiParser.getInstance(\"simple\"), but you might get into some bumps ;) Best, Dimitris On Wed, Dec 4, 2013 at 5:25 PM, Karsten Jeschkies < > wrote: uThx for the quick answer. SimpleWikiParser is fine. I just noticed that it does not remove '' or '''. Is that on purpose? thx, Karsten On 4 December 2013 17:01, Dimitris Kontokostas < > wrote:" "Updating local sesame database with Live dumps" "uHi All, I have a local Sesame repository. I want to update this database repeatedly by syncing the DBpedia-provided live dumps. I am trying to write a program using the OpenRDF Sesame Java API. If we check the live dump, suppose for today ( , there are multiple added/removed files available for the same timestamps. Like the 25-Feb-2013 12:00 timestamp, there are around 16 files. So in which sequence do I need to import the files so that my local data will not be inconsistent? Also, for inserting data, if I directly import this dump into the repository it will create duplicate records rather than updating records. I am under the impression that there is no function available for bulk update/delete using the live update dump. Hence I am thinking to follow the approach below. \"Process the DBpedia Live update dump line by line and insert/delete it into/from the database using INSERT DATA or DELETE DATA queries.\" Please correct me if I am wrong. uHi Gaurav, On 02/25/2013 02:11 PM, gaurav pant wrote: Those are not the dump files, you can find the dump files at [1]. We create those dump files on a monthly basis, and users can use the latest one as the starting point of synchronization. The files under [2] are the changeset files, i.e. the diff files resulting from comparing the newly extracted triples with the existing ones. Those files are the ones used for synchronization. The sync tool [3] we have developed continuously downloads those changeset files and uses them to synchronize a DBpedia-Live mirror with the official live endpoint.
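If you prefer to write the loop yourself rather than adapt the sync tool, the core of it is small. Here is a rough, untested Python sketch; I am assuming the changeset has already been downloaded as two gzipped N-Triples files, and that ENDPOINT points to wherever your store accepts SPARQL 1.1 Update requests (for Sesame this is usually the repository's /statements URL, but please check your installation):

```python
import gzip
import urllib
import urllib2

# Assumption: the store accepts SPARQL 1.1 Update at this URL (adjust to your setup).
ENDPOINT = "http://localhost:8080/openrdf-sesame/repositories/dbpedia/statements"

def run_update(query):
    body = urllib.urlencode({"update": query})
    urllib2.urlopen(urllib2.Request(ENDPOINT, body)).read()

def apply_changeset(removed_path, added_path, batch=500):
    # Apply deletions before insertions, so that a changed value
    # (old triple removed, new triple added) ends up in the store.
    for path, keyword in ((removed_path, "DELETE DATA"), (added_path, "INSERT DATA")):
        triples = [line.strip() for line in gzip.open(path)
                   if line.strip() and not line.startswith("#")]
        for i in range(0, len(triples), batch):
            run_update("%s {\n%s\n}" % (keyword, "\n".join(triples[i:i + batch])))

# File names here are invented for the example.
apply_changeset("000001.removed.nt.gz", "000001.added.nt.gz")
```

The one detail that matters is applying the removed triples before the added ones, so that an updated value is what remains in the store.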
The sync tool mainly works with Virtuoso [4], but since it's open source you can easily adapt it to work with Sesame if necessary. You can find more information about DBpedia-Live at [5]. uHi , Thanks Morsey I have created a Sesame Repository using \"openrdf-sesame-2.6.10 \" Find below sample code. import org.openrdf.model.Value; import org.openrdf.query.BindingSet; import org.openrdf.query.QueryLanguage; import org.openrdf.query.TupleQuery; import org.openrdf.query.TupleQueryResult; import org.openrdf.repository.Repository; import org.openrdf.repository.RepositoryConnection; import org.openrdf.repository.sail.SailRepository; import org.openrdf.sail.nativerdf.NativeStore; import org.openrdf.sail.inferencer.fc.ForwardChainingRDFSInferencer; import org.openrdf.rio.RDFFormat; import java.io.File; public class SesameRepository { / * @param args */ public static void main(String[] args) { // TODO Auto-generated method stub File dataDir = new File(\"/home/gaurav/My_Work/RM_req/19455/SesameAPI/MyRdfFile\"); Repository myRepository = new SailRepository( new ForwardChainingRDFSInferencer(new NativeStore(dataDir))); try{ myRepository.initialize(); File file = new File(\"/home/gaurav/My_Work/RM_req/19455/Steve-martin.nt\"); //taken for sample RepositoryConnection con = myRepository.getConnection(); String baseURI = \" con.add(file, baseURI, RDFFormat.NTRIPLES); String queryString = \"SELECT x, y FROM {x} p {y}\"; TupleQuery tupleQuery = con.prepareTupleQuery(QueryLanguage.SERQL, queryString); TupleQueryResult result = tupleQuery.evaluate(); while (result.hasNext()) { BindingSet bindingSet = result.next(); Value valuex = bindingSet.getValue(\"x\"); Value valuey = bindingSet.getValue(\"y\"); System.out.println(valuex+\"\"+valuey); } } catch(Exception e) { System.out.println(\"Some Exception Occured\n\"); } } } Now I want to update my repository (insert some more records/Delete some records) than can anyone please let me know how can i do it with handling all the possible error/exception situations. I do not require exact code but the class/method name for the same. Thanks! On Mon, Feb 25, 2013 at 7:01 PM, Mohamed Morsey < > wrote: uHi Gaurav, On 02/26/2013 10:03 AM, gaurav pant wrote: I'm not so experienced with Sesame, but if I'm not mistaken this thread might help [1]. [1] repository-connection-rollback-does-not-to-work-using-sparql-queries uHi Morsey, I was thinking little bit wrong. Please correct me if I am wrong about dbpedia live. \"There is no concept of updating existing record actually .It is being achieved with delete-insert operation sequence. Suppose I updates the author of a particular book named \"Mybook\" from XYZ to ABC than DBpedia live will provide two records as below. \"XYZ\" as *delete record* and \"ABC\" as *insert record* Please correct me if I am wrong. The dump On Tue, Feb 26, 2013 at 3:08 PM, Mohamed Morsey < > wrote: uHi Gaurav, On 02/26/2013 11:29 AM, gaurav pant wrote: Yes that's right." "Mapping extractor generates only 1 triple when a property has multiple objects" "uHi Jona, We have just generated fresh dumps for the Italian DBpedia with the latest extractors code version and found that some data is lost in the mapping-based dataset. If you have a look at this example [1], 'dbprop-it:genere' property has 4 objects, while 'dbpedia-owl:genre' only has 1. Same thing for 'dbprop-it:nome', which maps to 'foaf:name'. I checked the same resource in the English version [2] (the property is dbprop:genre) and the data is there. Is it a mapping extractor bug? 
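For reference, this is roughly how I am checking it against the endpoint (a quick Python sketch; the endpoint URL and the property namespaces are written from memory, so treat them as placeholders):

```python
import json
import urllib
import urllib2

# Placeholder endpoint and namespaces; adjust if they differ on your side.
ENDPOINT = "http://it.dbpedia.org/sparql"
QUERY = """
SELECT ?p ?o WHERE {
  <http://it.dbpedia.org/resource/Glenn_Danzig> ?p ?o .
  FILTER (?p = <http://it.dbpedia.org/property/genere> ||
          ?p = <http://dbpedia.org/ontology/genre>)
}
"""

params = urllib.urlencode({"query": QUERY,
                           "format": "application/sparql-results+json"})
results = json.load(urllib2.urlopen(ENDPOINT + "?" + params))
for row in results["results"]["bindings"]:
    print row["p"]["value"], "->", row["o"]["value"]
```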
Cheers, Marco [1] [2] Glenn_Danzig uHi Marco, yes, this is a bug. I don't know what's going on. genre = [[Heavy metal music|Heavy metal]], [[blues rock]], [[horror punk]], [[deathrock]], [[Classical music|classical]] All of them are extracted. |genere = Heavy Metal |genere2 = Alternative Metal |genere3 = Punk rock |genere4 = Hardcore punk But only Punk rock is extracted: Works for me - the sample extraction page contains http://xmlns.com/foaf/0.1/surname all'anagrafe Glenn Allen Anzalone But that 'genere' thing is strange. Maybe template properties that end with numbers are not mapped correctly? I looked through the code but didn't find an obvious problem. We'll have to start a debugger, I guess. Cheers, JC On Mon, May 7, 2012 at 12:58 PM, Marco Fossati < > wrote: uHi, I am pasting an old developers-list thread between me and Max (I could not find it on the archive search for a link) I think it is more or less about the same bug. I don't remember if it was fixed or not Cheers, Dimitris uNo, this is a different problem. The wikitext already contains four different properties: |genere = Heavy Metal |genere2 = Alternative Metal |genere3 = Punk rock |genere4 = Hardcore punk And contains mappings for all of them. The weird thing is that some of these numbered properties are extracted, other's aren't. Here's an example where four out of six are extracted: What's even more weird is that the Wikipedia article contains \"Folk\", but our framework extracts \"Folk music\". Excuse me, I have an urgent appointment with my psychiatrist. :-) On Tue, May 8, 2012 at 6:15 PM, Dimitris Kontokostas < > wrote: uNo, I'm not insane. It's not a bug, it's a feature. Let me explain. The ontology property 'genre' is an object property, so its values must be URIs and its parser is looking for links to other Wikipedia pages. In this case, the values are rendered as links in Wikipedia, but not entered as links. There is genere = Heavy Metal but not genere = [[Heavy Metal]] Even though it doesn't find a link, the object parser could simply generate a URI from the string \"Heavy Metal\", and in the case of this template property, it would be correct, but in many other cases, it would be wrong. As a compromise, we implemented a heuristic: If anywhere on the page there is a link whose label or target (I'm not sure about the details) matches the string, we generate a URI. If the page does not contain such a link, we generate nothing. Example: [[Punk rock]], but no links for the other three genres. That's why a triple is generated for \"genere3 = Punk rock\" but not for the other template properties. Even worse - there is a link to [[Heavy metal]], but in the infobox it's spelled \"Heavy Metal\" (with a capital M), so we don't find that link. This behavior could be considered a bug. Wikipedia somehow fixes the uppercase. The page about Johnny Cash contains a link [[folk music|folk]], and apparently that's why we extract As usual, when computers try to be smart, they tend to confuse us humans. :-) Cheers, JC On Tue, May 8, 2012 at 6:53 PM, Jona Christopher Sahnwaldt < > wrote: uHere's the heuristic code: On Tue, May 8, 2012 at 7:14 PM, Jona Christopher Sahnwaldt < > wrote: u2012/5/8 Jona Christopher Sahnwaldt < >: Fixing this case bug would fix the missing property we are seeing in the Italian DBpedia, right? Could it be done? 
uSorry :) I got confused and thought we were talking about the */property/* namespace Cheers, Dimitris On Tue, May 8, 2012 at 7:53 PM, Jona Christopher Sahnwaldt < >wrote: uHi Jona, Thanks for the exhaustive explanation. On 5/8/12 7:14 PM, Jona Christopher Sahnwaldt wrote: Would it be possible to emulate Wikipedia renderer engine behavior? It is written in PHP, so it should be a piece of cake to implement it in powerful Scala. Please let us know, as it is quite a critical issue for us. Cheers, Marco uHi Marco, This smells like another thread: Would it be possible to emulate Wikipedia renderer engine behavior? It Soare you volunteering to try it out? Cheers, Pablo On Thu, May 10, 2012 at 11:31 AM, Marco Fossati < >wrote: uHi Marco, sorry for the exhausting explanation. :-) After I had sent the mail I thought I should have written half as much As for the rendering - Wow, sweble looks pretty good. I'm envious. :-) Trying to parse a large article like enwiki/United_States gives quite a few warning messages though. [1] I would guess that adding template expansion to DBpedia is a *major* task. May take several months. It would also be a *huge* benefit. :-) Yes, the AbstractExtractor uses HTTP to call PHP. In 2009 or so, we tried pretty hard to get MediaWiki to run outside of a web server, but in the end we gave up. There are too many places in the code that are tied to server APIs. We also ran a few small benchmarks that strongly indicated that HTTP is not the problem. IIRC, the HTTP overhead only took about 10% of the whole rendering time. But managing to render wikitext in Java/Scala instead of PHP would probably be a large performance boost. Cheers, JC [1] On Thu, May 10, 2012 at 11:56 AM, Pablo Mendes < > wrote: uI think he just missed to add an appropriate emoticon at the end of the sentence :-) u2012/5/10 Pablo Mendes < >: This is not similar, In this thread I was asking about a couple of bugs of misconfiguration of dbpedia html page renderer written in VSP. Here Marco Fossati is asking about having more triples for our use case (the music 'genre' of pages like Glenn Danzig), where it is just the case of the template string that let the code miss to extract the other triples; this is regarding the scala extraction framework and not the description.vsp from dbpedia VAD. uOn 5/10/12 12:24 PM, Jona Christopher Sahnwaldt wrote: I meant exhaustive. :-) u2012/5/10 Jona Christopher Sahnwaldt < >: Sure it is, but here Marco Fossati was just asking about case insensitive wikilink discovery in order to have all music genres in Glenn Danzig page extraction 'as php of wikipedia seems to do' :-) You mentioned a way to get this fixed in the code, you plan to have this integrated in mercurial or shall I try to provide a patch for this ? uMarco, My pointer to the thread was with reference to the Sweble parser usage. Rendering pages \"like the PHP of wikipedia seems to do\" means just using their code or implementing template resolution. We've been doing the first, in that thread we suggested doing the second. Fixing this in the code has to be done carefully. The coreference resolution functionality is nice. In my opinion it should not be deactivated. When there is a link, then the object property can be generated directly. If there isn't a link, I like the idea to try to find one in the page. Also, there is some transitive closure over redirects happening down the road, and this can fix links to redirect pages. All of this is great in my opinion. 
I'm just reiterating this, because I haven't managed to calmly reread the whole thread, so I don't really know precisely what we're trying to fix here. Please pardon my lack of attention here. Cheers, Pablo On Thu, May 10, 2012 at 1:51 PM, Marco Amadori < >wrote: uI know. :-) On Thu, May 10, 2012 at 1:50 PM, Marco Fossati < > wrote: uI just made that little change. I had looked at the code before, so it was very simple. We now also get the triple for \"Heavy metal\", but that's it: Let's hope that this doesn't introduce too many extraction errors. It's quite unlikely though. I looked around a little and finally found a case that we would treat differently after this this change. Potentially wrong, but it's highly unlikely. The pros will certainly outweigh the cons. On Thu, May 10, 2012 at 1:51 PM, Marco Amadori < > wrote: u2012/5/10 Jona Christopher Sahnwaldt < >: It seems a good news, but Nice example on why case insensitiveness is bad. :-) I think I'm changing my mind about that issue. The code should not try to be smart at all. My question is, if the mapping is hand made and trusted over the page, why in the code we trust the page (links) more than the mapping? E.g. in the Artist mapping if I says that 'genere' is genre in the dbpedia owl, I should be trusted more than the page which could not have the links. Shouldn't the mapping be the king or this could be wrong in other ways? uHi Marco, Jona, Shouldn't the mapping be the king or this could be wrong in other ways? Absolutely!!! User-generated content is our \"business\". Let me repeat, with added emphasis: \"When there is a link, then the object property MUST be generated directly FROM THE LINK PROVIDED BY THE USER. If there isn't a link FOR AN OBJECT PROPERTY, then we should try to find one in the page.\" More details Case 1: a link is provided as value. Solution: keep it as is Case insensitivity in URIs is nonsense. We should treat URIs as if they were numbers or any other kind of opaque id. Even if in Wikipedia we know that they play around with \"readable IDs\". Case 2: a link is not found, only a string. Now, for strings, case insensitivity starts to make sense. But we should be aware that it can cause errors. Perhaps having a separate dataset for \"guessed property values\" would be the safest. One exception where tampering with user provided values is acceptable: when other user-provided value corrects it. That's the case for redirects for example. So if the value of a property is a redirect page, we should point it to the end of the redirect chain. That's my opinion. Cheers, Pablo On Thu, May 10, 2012 at 5:45 PM, Marco Amadori < >wrote: uI think what Marco meant was: the mapping says it's an object property, so we should extract a URI, even if the property value is just a string. In the case of the musician infoboxes on it wiki, that would work, but in many other cases, it wouldn't. For example: Evilive\". Just that string, no links. We could use a heuristic to split the string into multiple links etc, but I don't think there's a good, clean solution. With a naive approach we would extract , which would be wrong. There is a simple rule though: If the Wikipedia template renders strings as links, then we should extract strings as URIs. Otherwise we shouldn't. The problem is that our code can't find out what the template does (well, it could, but that would be almost as hard as rendering templates). But humans can. 
So to implement that rule, we have to add a feature to the mappings wiki, as I described in my previous mail, so users can add a flag saying \"yes, plain string values in this property should be extracted as URIs\". It seems that Italian Wikipedia templates often work like this, while English templates rarely do. To make that behavior possible, the Italian templates use multiple properties like genere, genere2, genere3 etc, while the English templates use one property which the editors can fill with links or strings as they like. Cheers, JC On Thu, May 10, 2012 at 6:18 PM, Pablo Mendes < > wrote: uOh, I don't see a problem here. There would only be a problem if we extracted a page that contained the string \"NeoCon\" in the infobox but a link to [[Neocon]] in the text (or vice versa), and that's rather improbable. Even if something like that *does* happen I guess it's more likely to be a typo (that we now automagically fix) than a deliberate, meaningful spelling difference. On Thu, May 10, 2012 at 5:45 PM, Marco Amadori < > wrote: uOn Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote: Right. We should use the same euristic used by the wikimedia template engine, that way it would match with the proper wikipedia page. But that isn't implicit if the mapping creator maps it to ObjectProperty ? This again means to me that we should trust mappings. uIn the context of Jona said Sorry, I don't understand this. When I said we can *find* a link, I didn't mean that we should *invent* one. I really mean going into the text and finding if a URI with an exact match in the anchor text is linked from the page, much like it seems that the current coreference code does. Or did I misunderstand something? cheers Pablo On Thu, May 10, 2012 at 8:11 PM, Jona Christopher Sahnwaldt < uPablo, I wrote that we could use a heuristic, but I also meant that we should not do this. Apparently the rest of my sentence was not clear enough. :-) Cheers, JC On Fri, May 11, 2012 at 1:43 AM, Pablo Mendes < > wrote: uOn Thu, May 10, 2012 at 9:09 PM, Marco Amadori < > wrote: At first glance, that may look like a nice idea, but it would (very likely) mean that DBpedia would extract many additional URIs that are wrong and only a few additional URIs that are correct. Slightly better recall, much worse precision. I should add that that's a (strong) hunch based on my experience with DBpedia extractions and a few clicks in Wikipedia. I don't have actual data to back this claim. The Wikimedia template engine in general does not use heuristics. The specific template use heuristics, but pretty simple rules: whatever value Wikipedia users enter for one of the 'genere' properties is wrapped in '[[' and ']]', and thus rendered as a link. That's why it would make sense to allow users to add a special flag to a property mapping. Our framework should always extract the wikitext string value for one of the 'genere' properties as a RDF URI, not as a RDF literal. But for most other properties, such behavior would lead to wrong URIs. Even if the template property maps to an object property. No, see above. But only if the mapping explicitly states that this property should always be extracted as a URI. The editor of the mapping should check the source code of the Wikipedia template. If the template ALWAYS renders a property value as a link, then we should ALWAYS extract URIs for the property values. 
If not, then we should ONLY extract a URI for the property value if we can find a matching link somewhere on the page (that's the heuristic we use now). One more thing that may be relevant for this discussion: In the case of Template:Artista_musicale, things are even more intricate. Template:Artista_musicale calls property. Template:Autocat_musica contains a long list of musical genres and slightly different names that may be used for them, for example: |acidjazz |acid-jazz |acid jazz=[[Acid jazz]][[Categoria:{{{tipo}}} acid jazz]] This means that if a Wikipedia page contains \"genere=acidjazz\", the rendered HTML will contain a link to \"Acid jazz\". DBpedia won't be so smart. Even if we extend the framework with that special \"always extract this property as URI\" flag, DBpedia would extract the URI pretty useless, since there is no redirect from But that's a minor problem. I still think that adding that flag would be a good thing. Cheers, JC uOn Sunday 13 May 2012 21:04:05 Jona Christopher Sahnwaldt wrote: wrote: In my requirements, this will happen if and only if the same happens in mediawiki code, or in other words the DBpedia heuristic is the same as Mediawiki's one. If the mediawiki template engine would produce a wikilink we should do it, otherwise we do not. There is no '[[', ']]' in the page, so who is adding them in 'genere' properties? If anyone does it, this is 'my' heuristic. Ok. In less words, it isn't true that we should do the same as wikipedia does? This could be a nice wanted feature for DBpedia codebase. If it is the simplest thing to do, go for it. uOn Sun, May 13, 2012 at 9:13 PM, Marco Amadori < > wrote: I agree. It's not really MediaWiki ot the template engine in general though, but specific templates. See below. others, I didn't dig deep enough to be sure). For example, this snippet from the source of Template:Artista_musicale: {{Autocat musica|genere={{{genere|}}}| passes the value of 'genere' on to Template:Autocat_musica, and Template:Autocat_musica contains [[{{{genere|}}}]] which wraps the value of genere in '[[' and ']]'. BUT if the template contains a property 'categorizza per genere = no', then different code in Template:Artista_musicale applies: {{#ifexist:{{{genere|}}}|[[{{{genere|}}}]]|{{{genere|}}}}} handles the value of the 'genere' property. It first looks for a page with that title. If auch a page exists, it renders a link. If such a page does not exist, it renders just the text. The treatment of the other 'genere2' etc properties is a bit different: {{#if:{{{genere2|}}}| [[{{{genere2}}}]]{{{nota genere2|}}}|}} If there is a value for genere2, it is rendered as a line break and a link, appending the text of an optional note. If there is no value for genere2, nothing is rendered. So, to accurately mimic the behavior of these templates the DBpedia extraction would have to do this: if ($value of property 'genere' is not empty) then if (categorizza per genere = no) then if (page with title $value exists) then extract $value as URI else extract $value as string // [1] end if else // normalize $value, e.g. 'acidjazz' to 'Acid jazz' extract normalized $value as URI end if end if if ($value of property 'genere2' is not empty) then if (categorizza per genere = no) then extract $value as URI else // normalize $value, e.g. 
'acidjazz' to 'Acid jazz' extract normalized $value as URI end if end if the other genere properties are like genere2 [1] (or rather, don't extract anything, because a string is not a valid value for an obect property) There are probably other cases that I have missed Wikipedia templates are a mess:-) We SHOULD do the same as Wikipedia. Or rather, we should try to get close to it with reasonable effort. Well, I shouldn't have written always. More precisely, balancing effort and extraction precision: If the template MOST OF THE TIME renders a property value as a link, then we should ALWAYS extract URIs for the property values. Yes, it would be nice to have a feature that allows us to add such mappings to the extraction configuration. But I think the specific mappings should not be in the code, but on the mappings wiki. That of course means that the mappings wiki needs a lot of new features, probably new namespaces, etc. I'm afraid I won't even have the time to implement this anytime soon. I'm already way behind schedule, and we really need to get DBpedia 3.8 out. Cheers, JC uOn Monday 14 May 2012 00:08:02 Jona Christopher Sahnwaldt wrote: Many thanks for the explanation, now I'm getting the whole picture. So the only sane way without a templating engine capable of digesting Wikipedia's template in DBpedia, seems to me being the Mapping flag. Let's go for it? uI propose the following algorithm: Let's keep the current codebase that produces a triple if and only if a same cased wikilink is present elsewhere in the page. This time it does not trash the triple if it does not find the link, instead the code will put the triple in another file, named like 'mapping_based_properties_ambiguous_part1' which will be analized after the end of extraction, with the help of the redirects map (which is available at the end of the whole wikipedia extration phase). This way if a property exist in the redirects map if would be moved (after being redirected) in a file named 'mapping_based_properties_disambigued', in the same file it would be placed also if a the property exists in a case insensitive form in the redirects map without ambiguity (so not Neocon vs NeoCon in this file). Finally if the property has still ambiguity we will leave it the final '.*_ambiguous' file. If this file is zero sized for all wikipedias, I would like to have a beer* :-) This way we let the DBpedia maintainer to choose or not to have this data inside his/her DBpedia instance or to pass those triples through the Mediawiki APIs, some Machine Learning tool or Amazon's Mechanical Turk in order to verify them :-) *probably also if it is not uOn 5/15/12 11:59 AM, Marco Amadori wrote: +1 u+1 on the proposed solution +1 on the beer* Also, -ambiguous can contain triples where the property is defined as ObjectProperty and the value is a String. I've heard that there is a tool out there called DBpedia Spotlight (or something like that) that could be used to disambiguate these links automatically. :) Now we're talking paper material. Who's in? Cheers Pablo On May 15, 2012 11:59 AM, \"Marco Amadori\" < > wrote: u2012/5/15 Pablo Mendes < >: Wait, the algorithm above mentioned is exactly about that: -ambiguous containst triples for \"Strings for ObjectProperties\" where there is no link in the page and where there is not a single redirect. Another step file with '-spotlighted' could be surely done! ? I'm not that kind of guy, I'm talking about code, not papers :-) uOh, ok. Sorry. Should stop answering e-mails while walking to the train. 
:) Alright. Marco, so will you give a go on the -ambiguous creation? Based on that I can try the -spotlight file. Cheers, Pablo On Tue, May 15, 2012 at 12:29 PM, Marco Amadori < >wrote: u2012/5/15 Pablo Mendes < >: I was hoping that Jona could take on that, he knows an order of magnitude better than I do the DBpedia extractor codebase. It could take ages for me to do that since I would need first to study the code (and scala too). Cool! uAnd here is another open issue. Cheers, Marco |genere = Heavy Metal |genere2 = Alternative Metal |genere3 = Punk rock |genere4 = Hardcore punk all properties map to the same dbpedia-owl:genre property, but only 2 are extracted i.e. 'Punk_rock' and 'Heavy_Metal' See thread: Vasx1ELpxVc" "DBpedia Search (was The Semantic Web becoming real)" "uHi guys, I think the DBpedia Search interface provides a nice complement to the regular wikipedia web interface. Sometimes it's much faster than wikipedia, and I do like the tag cloud result refinement (although I think a little \"click on a tag to refine your results\" message above the tag cloud would help people understand what the tag cloud is for). One thing I'd really like to see is links back from the html results pages to the underlying RDF. Maybe this exists already, but I can't see it anywhere. tags in the HTML and little RDF icons that did this would close the loop nicely, provide crawlable RDF, and help get the DBpedia RDF out in the world rather than just relying on the SPARQL endpoint. Am I making any sense? Tom. PS. Is it time to do conneg on DBpedia URIs? uHi Tom, :) Alright, I'll add this. Very good point! The links don't exist at the moment. Simply because I forgot them. They'll be added tomorrow. I think we do have content negotiation on the URIs. Don't we? Cheers, Georgi uGeorgi Kobilarov wrote: Georgi, Have you looked at how we handle URI dereferencing in our RDF Browser or our Dynamic Data Pages demos? Examples: 1. Entrepreneurs via RDF Browser 2. TimBL via Dynamic Data Page 3. Just enter The DBpedia search effort can exhibit these characteristics by exploiting the enhanced anchor tag (what I call ++ internally) feature of OAT (that's what we do re. 1&2). Kingsley" "is it possible to extract internal links and available languages with a query?" "uHi, Is it possible to extract 1. internal links (i.e. links between different Wikipedia articles) and 2. which languages are available for an article (e.g. English, German, Italian) with a query? The content of this e-mail (including any attachments hereto) is confidential and contains proprietary information of the sender. This e-mail is intended only for the use of the individual and entities listed above. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication (and any information or attachment contained herein) is strictly prohibited. If you have received this communication in error, please notify us immediately by telephone or by e-mail and delete the message (including the attachments hereto) from your computer. Thank you!" "Geographic Coordinates of Places" "uHello Everybody I have a couple of questions and probably you have the correct answer that I can't find. 
1) I'm retrieving city points using the SPARQL endpoint and I find that there are various predicates for the geographic coordinates: geo:lat , geo:long, geo:POINT, grs:point and geo:geometry For my convenience I used geo:geometry WKT format, but I realized that not all the resources have this predicate ( not have it while It poses me the question, Which of those predicates is present in all the resources? 2) This question comes from a problem that I have with some points in DBpedia. For example Allueva (Spain). Are the coordinates extracted from Wikipedia geodata or from another service? Because I see that the properties are properly parsed dbpedia:latd, dbpedia:latm, dbpedia:latns etc but the geo:geometry has a different value. Thanks for your time! and keep a good work :-) Hello Everybody I have a couple of questions and probably you have the correct answer that I can't find. 1) I'm retrieving city points using the SPARQL endpoint and I find that there are various predicates for the geographic coordinates: geo:lat , geo:long, geo:POINT, grs:point and geo:geometry For my convenience I used geo:geometry WKT format, but I realized that not all the resources have this predicate ( service? Because I see that the properties are properly parsed dbpedia:latd, dbpedia:latm, dbpedia:latns etc but the geo:geometry has a different value. Thanks for your time! and keep a good work :-) uHi Jordi, I guess the most uses are geo:lat, geo:long and geo:POINT. At least the first two predicates, because they come from the W3C Geo onto [1] and is one of the most used in GeoData [2]. WKT is a serialization as well as KML, GeoJSON , etc> 2) Here, I could only point you to other initiatives/ services dealing with GeoData: linkedGeoData [3], GADM [4] and NUTS [5] with mappings to DBpedia I hope it helps Best, Ghislain [1] [2] [3] [4] [5] uThanks. Your answers and links helped a lot. But, It would be nice to review how DBpedia obtains the geo:lat,geo:lon and geo:geometry because there are resources with incoherent data (geo:lat has an apparently incorrect value while the dbpedia extractions from Wikipedia are correct). If those values come from the Wikipedia dumps seems like the geo-extractor is failing in some cases, obtaining incorrect geo:lat geo:lon values. Salut! On Wed, May 30, 2012 at 1:17 PM, < > wrote:" "Introducing the Ontology2 Edition of DBpedia 2016-04" "uWe are proud to announce the Ontology2 Edition of DBpedia on the AWS Marketplace, available at this product is a combination of Ubuntu Linux, OpenLink Virtuoso Open Source Edition and data from DBpedia 2016-04 with carefully chosen hardware, constructed with an advanced automated packaging system and tuned for reliability, high performance, and the ability to execute difficult queries. Not everyone has the powerful hardware required to do SPARQL queries against DBpedia. We’ve applied more than two years of experience packaging RDF data for the AWS Marketplace to make a product that levels the playing field to enable you do to powerful SPARQL 1.1 queries over the complete English language DBpedia with one click deployment and pricing that scales with your needs. 
With 168% more facts than the public DBpedia endpoint and more 73% more than our last version, the Ontology2 Edition of DBpedia 2016-04 offers a full fat, full throttle experience that is satisfying for academic, commercial and other uses – despite this considerable expansion, we’ve reduced hardware requirements and pricing so that our 2016-04 edition costs from 25% to 50% less than our 2015-10 edition. What is DBpedia good for? DBpedia is a collection of facts extracted from the English language and other editions of Wikipedia and features wide-spectrum coverage of most topics that are widely known, such as persons, places, historical events, chemical compounds, products, and abstract concepts. DBpedia concepts intersect strongly with most vertical domains such as Finance, Health Care, Geospatial, Ski Areas, etc. Frequently additional work is required to make a fully functional data set relevant to a specific domain, yet, DBpedia can be the basis of a “first draft” database on any almost any topic. Beyond that, DBpedia contains valuable enrichment information and can be used as a Rosetta stone between competitive and cooperative ontologies and databases. One of the most valuable forms of enrichment DBpedia can provide is multilingual information, thus DBpedia has special value for those who develop applications for the EMEA (Europe, Middle East, and Africa) market where the existence of many languages poses a challenge for education, commerce and peace. Although DBpedia lends most naturally to a database, logical, or rule-based approach, the correspondence between a large database of facts and supporting text makes DBpedia a key resource for text understanding work using the machine learning methodologies that are currently popular. How does the Ontology2 Edition of DBpedia 2016-04 compare with prior versions? The Ontology2 Edition of DBpedia 2016-04 contains numerous improvements over the Ontology2 Edition of DBpedia 2015-10 and previous releases. First, the Ontology2 Edition of DBpedia 2016-04 is our best edition of DBpedia yet because DBpedia 2016-04 is the best DBpedia yet. New data sets open the way to new applications and improvements to the extraction frameworks including machine learning mechanisms improve quality in general. Compared to previous editions, our 2016-04 edition features optimized I/O and networking the AWS cloud. To avoid slow initial speed while the EBS image is loading from the snapshot, we force the EBS image to be initialized as fast as possible and only give you access when it is ready to deliver fast and predictable I/O. For the first boot, you’ll need to wait about 90 minutes for it to be ready, but it is worth the wait because you’ll get consistently strong performance out of the gate – assisted by numerous changes to the configuration and build process that stabilize the system, even when tackling the toughest queries. The 2016-04 edition is our first edition to use named graphs to isolate and identify 71 different data sets provided by the DBpedia Foundation. You query the union of these graphs by default, so It works like it always has, but you can also pick and choose which datasets to use for which triple patterns so you can pick between multiple points of view of facts and look at the relationships between various points of view. We’ve improved our pricing model to be a better fit for more users and align our interests with your own. 
With a choice between of a low hourly rate of $1.66 USD an hour inclusive of hardware and a $499/year subscription, the Ontology2 Edition of DBpedia 2016-04 is not just the fastest, but also the least cost solution for almost anyone who wishes to perform heavy SPARQL queries over DBpedia. How does the Ontology2 Edition of DBpedia 2016-04 compare to other options? Two years ago, our :BaseKB product was the first linked data machine image to be offered in a public cloud marketplace. Other brands have come and gone, but we’ve produced more machine images with a more diverse range of different data products than anyone else. We’re not funded by a research grant, triple store vendor, or cloud service provider, and we use our machine images for our own work, so we’re focused entirely on the needs of people who query or otherwise consume Linked Data. Speed of execution is critical in the world of corporations, startups, and publish-or-perish academia and the Ontology2 Edition of DBpedia 2016-04 delivers. It frees you to focus on your own unique contributions without the distraction of provisioning hardware and working with triple stores at the edge of the performance envelope. Act Now Subscribe to the Ontology2 Edition of DBpedia 2016-04 in the AWS Marketplace and you could be getting results in as little as two hours. The Ontology2 Edition of DBpedia 2016-04 is available in all current and future availability zones in the world’s most popular cloud services provider. With pay-as-you-go pricing, the Ontology2 Edition of DBpedia 2016-04 delivers optimized hardware and software when you need it – and without any commitment, there is no reason not to subscribe today. uHi Paul, you always send interesting postings about this stuff, Linked Data or whathever it is named now But i have had nothing to do with AWS Marketplaces until now, therefore my stupid question, sorry: When i subscribe your 'The Ontology2 Edition of DBpedia 2016-04' a.) can i give to the world then directly a SPARQL-endpoint-link (with 'full throttle experience') for your dbpedia-dataset, so that everybody can check its performance like we do it since a while ('throttled performance') for example with common endpoint: or b.) can only i or my app (logged in) query the dataset? i have some apps to check such things, but i have to decide quickly with a model case, is it worth to invest time? i hope for a simple, introducing respond for interested with 'no AWS-experience' but very interested to check the performance easily reproducible for all Thanks, baran. uThis should be easy to do. By default, port 8890 is open to the world when the image boots up, so from the AWS side there is no problem, it is just a matter of enabling the public endpoint in Virtuoso, or alternately creating new credentials for people to log in. One issue is that with the throttles off, you can ask questions that take 20+ minutes to answer uOn Wed, 02 Nov 2016 14:23:19 +0100, Paul Houle < > wrote: uOn Wed, Nov 2, 2016 at 4:28 PM, < > wrote: uExactly, in evaluating the O2 Edition, the economics comes first, like it or not. For research and development purposes you are going to put severe stress on a triple store. You can always think of some query that explodes combinatorically, or cases where your graph database doesn't realize there is a much simpler way of answering the query than it what it does. With the limiters out, it becomes easier to overload the database. 
If it's your own database you see the whole picture involving an occasional crash, rebooting the server, understanding the bottleneck vs. deciding not to write the kind of query that crashes. You can write more triples into the store, something you can only do for your own store because it is your own. You can turn it off when you don't need it and save a lot of money, also you can hit a button and reset it back to factory condition if you get it in a bad place somehow. The minimum wage here in New York is $9.00 an hour and that's a low rate globally for computer operators. The hourly cost inclusive of hardware is a small fraction of that. If it is being used by a developer on a 8-hour work day, you pay for it 1/3 what you would for running it all day. The $499 annual subscription is compared to what it might cost to create something similar yourself, such as $200 for 32GB RAM upgrade to a laptop + 8 hours for planning out the process, + 4 attempts made to produce successful release at 8 hours each (4 production + 4 testing) The cost to roll your own is then $299 + $40 * R where R is the rate. You couldn't do this legally in New York for less than $560. And that's for something without extensive optimization, which comes with no support whatsoever, etc. If you bill the annual subscription to a credit card you can use it together with service credits, reserved instances and other techniques to scale out the hardware as much as you like at a low cost. I have seen people subscribe and spend about $20 to do a quick evaluation, so if you have questions it is very worthwhile for you to try it yourself at A Step-by-step tutorial that includes some interesting example queries is here: Once you have tried it out, please leave a review on the product page so people won't just have to take my word for it. uOn 11/3/16 11:03 AM, Paul Houle wrote: Paul, Nice breakdown. uOn Wed, 02 Nov 2016 11:38:58 +0100, < > wrote: uOn Thu, 03 Nov 2016 13:33:51 +0100, Dimitris Kontokostas < > wrote: uOn Thu, 03 Nov 2016 16:35:29 +0100, Kingsley Idehen < > wrote: Is that all? Nice breakdown, as your ultimative comment on this thing? This could also come from Trump or ClintonPolitics, politics until the whole world breaks down baran u0€ *†H†÷  €0€10  `†He uOn Fri, 04 Nov 2016 16:57:35 +0100, Kingsley Idehen < > wrote: And, do you mean Pauls offer to operate with looser constraints than the public instance (which you provide) can solve some general problems you have by changing some usage specificy? That is exactly where you have to inform us what you think about this or say 'i will try it out and inform you abaut my results'! What should we think about your 'nice breakdown'? If you write 'NICE breakdown', it is normal that someone thinks, was that all? And i think you understand this. I am ofcourse ready to wait baran uOn 11/4/16 12:40 PM, wrote: I am saying: A personal or service-specific instance of DBpedia has less traffic and less volatile query mix. Once is serving the world the other a specific client application / service. We have build AMIs for years offering this option too, but folks tend to prefer the $0.00 public instance (which doesn't cost us $0.00 to publish and maintain). You can test what Paul is offering to see if it meets your needs on a basic cost vs benefit basis (the reason why he broke down the costs in his reply). uSo, you say Pauls offering has nothing to do with a dbpedia-SPARQL-endpoint serving the world, its aim is serving 'a specific client application'. 
Although, it semms to be 'NICE' I register this as yor your opinion to my origin posting to Paul Thanks, baran uPaul, Nice breakdown. uPaul, Nice breakdown. Cheers, Jörn u0€ *†H†÷  €0€10  `†He uOn Fri, 04 Nov 2016 20:39:28 +0100, Kingsley Idehen < > wrote: Thanks Kingsley, we checked this already, theoretical possible (though not the real aim of the offer, its real aim is serving 'a specific client application'), but if we try it out as a public endpoint, any hints and inputs how possible a better handling of 'rate limiting, time execution limits, incomplete results, abuse-attacs, bandwith etc problems' of a usual public dbpedia-SPARQL-endpoint where we land then again? i think these are problems about which we have to discuss and inform the users on a very open-minded and very easy manner, exactly in such a case I opened this thread after pauls initial posting, while it was important to me and we all have our special ideas what the problems in general case are and why Linked Data cannot set off since so many years, how many are we in this environment at all? baran." "Problem in creating Indonesian Chapter" "uHi All, I will create Indonesian Chapter of DBpedia. But, I have some problems.  Try to visit  When I visit  And the other problem, every resource link in page    Regards, Riko Hi All, I will create Indonesian Chapter of DBpedia. But, I have some problems. Try to visit Riko uHi Riko, I am not an expert on this but if you set this up with apache ProxyPass, ProxyPassReverse should also be set. If it works correctly with the current domain, it will also work with the official one. Best, Dimitris On Thu, May 16, 2013 at 5:58 PM, Riko Adi Prasetya < >wrote:" ":BaseKB is back" "uWe all know and love DBpedia, and the great news is that it goes together with Freebase data like peanut butter and jelly. The bad news is that Freebase is shutting their service down and will no longer be possible to do SQL queries against it. Fortunately we have developed a process to convert Freebase data into a compact set of RDF triples that can be queried with SPARQL to get correct answers. We've just released the \"ultimate\" version of :BaseKB which is current to the last Freebase dump Those wanting to experiment with or evaluate this product should try the AWS Marketplace image that we supply, which contains all 1.2 billion triples loaded into OpenLink Virtuoso Open Source edition, provisioned with enough RAM and Storage to outperform any public SPARQL endpoint you've ever seen." "DBpedia hosting burden" "u(trimming cc: list to LOD and DBPedia) On Wed, Apr 14, 2010 at 7:09 PM, Kingsley Idehen < > wrote: (Leigh) Have you considered blocking DBpedia crawlers more aggressively, and nudging them to alternative ways of accessing the data? While it is a shame to say 'no' to people trying to use linked data, this would be more saying 'yes, but not like that'. That's useful to know, thanks. Do you have the impression that these folk are typically trying to copy the entire thing, or to make some filtered subset (by geographical view, topic, property etc). Can studying these logs help provide different downloadable dumps that would discourage crawlers? Looking at anything discouraging crawlers. Where is the 'best practice' or 'acceptable use' advice we should all be following, to avoid putting needless burden on your servers and bandwidth? 
As you mention, DBpedia is an important and central resource, thanks both to the work of the Wikipedia community, and those in the DBpedia project who enrich and make available all that information. It's therefore important that the SemWeb / Linked Data community takes care to remember that these things don't come for free, that bills need paying and that de-referencing is a privilege not a right. If there are things we can do as a technology community to lower the cost of hosting / distributing such data, or to nudge consumers of it in the direction of more sustainable habits, we should do so. If there's not so much the rest of us can do but say 'thanks!', then, er, 'thanks!'. Much appreciated! Are there any scenarios around eg. BitTorrent that could be explored? What if each of the static files in were available as torrents (or magnet: URIs)? I realise that would only address part of the problem/cost, but it's a widely used technology for distributing large files; can we bend it to our needs? cheers, Dan uDan Brickley wrote: Yes. Some have cleaned up their act for sure. Problem is, there are others doing the same thing, who then complain about the instance in very generic fashion. I think we have an outstanding blog post / technical note about the DBpedia instance that hasn't been published (possibly due to the 3.5 and DBpedia-Live work we are doing), said note will cover how to work with the instance etc> Many (and to some degree quite natural) attempt to export the whole thing. Even when they're nudged to use OFFSET and LIMIT, end result is multiple hits en route to complete export. We do have a solution in mind, basically, we are going to have a different place for the descriptor resources and redirect crawlers there via 303's etc> We'll get the guide out. \"Bills\" the major operative word in a world where the \"Bill Payer\" and \"Database Maintainer\" is a footnote (at best) re. perception of what constitutes the DBpedia Project. Our own ISPs even had to get in contact with us (last quarter of 2009) re. the amount of bandwidth being consumed by DBpedia etc For us, the most important thing is perspective. DBpedia is another space on a public network, thus it can't magically rewrite the underlying physics of wide area networking where access is open to the world. Thus, we can make a note about proper behavior and explain how we protect the instance such that everyone has a chance of using it (rather than a select few resource guzzlers). When we set up the Descriptor Resource host, these would certainly be considered. Also, we encourage use of gzip over HTTP :-) Kingsley uNathan wrote: Yes. Yes. Kingsley uDan, If I were The Emperor of LOD I'd ask all grand dukes of datasources to put fresh dumps at some torrent with control of UL/DL ratio :) For reason I can't understand this idea is proposed few times per year but never tried. Other approach is to implement scalable and safe patch/diff on RDF graphs plus subscription on them. That's what I'm writing ATM. Using this toolkit, it would be quite cheap to place a local copy of LOD on any appropriate box in any workgroup. A local copy will not require any hi-end equipment for two reasons: the database can be much smaller than the public one (one may install only a subset of LOD) and it will usually less sensitive to RAM/disk ratio (small number of clients will result in better locality because any given individual tend to browse interrelated data whereas a crowd produces chaotic sequence of requests). 
Crawlers and mobile apps will not migrate to local copies, but some complicated queries will go away from the bottleneck server and that would be good enough. Best Regards, Ivan Mikhailov OpenLink Software http://virtuoso.openlinksw.com uOn Wed, Apr 14, 2010 at 8:11 PM, Kingsley Idehen < > wrote: They're lucky it exists at all. I'd refer them to this Louis CK sketch - (if it stays online). That sounds useful Yes, I'm sure some are thoughtless and take it for granted; but also that others are well aware of the burdens. (For that matter, I'm not myself so sure how Wikipedia cover their costs or what their longer-term plan is). This I think is something others can help with, when presenting LOD and related concepts: to encourage good habits that spread the cost of keeping this great dataset globally available. So all those making slides, tutorials, blog posts or software tools have a role to play here. Ok, let's take care to explore that then; it would probably help others too. There must be dozens of companies and research organizations who could put some bandwidth resources into this, if only there was a short guide to setting up a GUI-less bittorrent tool and configuring it appropriately. Are there any bittorrent experts on these mailing lists who could suggest next practical steps here (not necessarily dbpedia-specific)? (ah I see a reply from Ivan; copying it in here) I suspect BitTorrent is in some ways somehow 'taboo' technology, since it is most famous for being used to distributed materials that copyright-owners often don't want distributed. I have no detailed idea how torrent files are made, how trackers work, etc. I started poking around magnet: a bit recently but haven't got a sense for how solid that work is yet. Could a simple Wiki page be used for sharing torrents? (plus published hash of files elsewhere for integrity checks). What would it take to get started? Perhaps if download published (rdfa?), then others could experiment with torrents and downloaders could cross-check against an authoritative description of the file from dbpedia? Are there any RDF toolkits in need of a patch to their default setup in this regard? Tutorials that need fixing, etc? cheers, Dan ps. re big datasets, Library of Congress apparently are going to have complete twitter archive - see uRoss Singer wrote: I meant: the are sending a series of these query patterns with the same goal in mind: an export from DBpedia for import into their own Data Spaces. You can, and should use the full gamut of SPARQL queries, the issue is how they are used. On our side, we've always had the ability to protect the server. In recent times, we simply up the ante re. protection against problematic behavior. My only concern is that the tightening of control is sometimes misconstrued as a problem with the instance etc uDan, I just setup some torrent files containing the current english and german dbpedia content: (as a test/proof of concept, was just curious to see how fast a network effect via p2p networks). To try, go to I presume to get it working you need just the first people downloading (and keep spreading it around w/ their Torrent-Clients)as long as the *.torrent-files are consistent. (layout of the link page courtesy of the dbpedia-people) Kind regards, Daniel On Wed, Apr 14, 2010 at 9:04 PM, Dan Brickley < > wrote: uOn Wed, Apr 14, 2010 at 11:50 PM, Daniel Koller < > wrote: Thanks! OK, let's see if my laptop has enough disk space left ;) could you post an 'ls -l' too, so we have an idea of the file sizes? 
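Following on from Daniel's experiment, creating and seeding a torrent for one of the dump files needs little more than a command-line torrent maker plus any client left running to seed; the tool, tracker URL and file name below are assumptions, not the setup Daniel actually used:

    mktorrent -a http://tracker.example.org/announce \
              -o dbpedia_3.5_en.nt.bz2.torrent \
              dbpedia_3.5_en.nt.bz2
    # publish the .torrent, then keep a client seeding the original file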
Transmission.app on OSX says \"Downloading from 1 or 1 peers\" now (for a few of them), and \"from 0 of 0 peers\" for others. Perhaps you have some limits/queue in place? Now this is where my grip on the protocol is weak uDan, At the moment there is just one seeder ;-) - and I did not manage yet to use existing storage locations of dbpedia data - so not all network capacity in the world is available; uOn Thu, Apr 15, 2010 at 10:57 AM, Daniel Koller < > wrote: :) sure! uHello, Daniel Koller schrieb: uAgree. Common errors in LOD are: uMalte Kiesel wrote: Malte, What about the EC2 AMIs we made, basically, we even pay for a DBpedia snapshot [1] (meaning no loading, just mount and go). We provide a simple loader script and VAD package Linked Data Deployment [2] via your own Virtuoso instance (you can also get Virtuoso Open Source Edition via most Linux Distros post KDE 4 release). Links: 1. 2. . We are also going to release a data sync and replication engine for RDF that basically allows you to keep subscribers in sync with publishers (directly or via hubs). uAndy Seaborne wrote: Andy, Great stuff, this is also why we are going to leave the current DBpedia 3.5 instance to stew for a while (until end of this week or a little later). DBpedia users: Now is the time to identify problems with the DBpedia 3.5 dataset dumps. We don't want to continue reloading DBpedia (Static Edition and then recalibrating DBpedia-Live) based on faulty datasets related matters, we do have other operational priorities etc uAndy Seaborne wrote: Imperfect then, however subjective that might be :-) That's been the approach thus far. Anyway, as I said, we have a window of opportunity to identify current issues prior to performing a 3.5.1 reload. I just don't want to reduce the reload cycles due to other items on our todo etc uKingsley Idehen wrote: Actually meant to say: Anyway, as I said, we have a window of opportunity to identify current issues prior to performing a 3.5.1 reload. I jwant to reduce the reload cycles due to other items on our todo etc :-) uIan Davis wrote: Ian, When you use the term: SPARQL Mirror (note: Leigh's comments yesterday re. not orienting towards this), you open up a different set of issues. I don't want to revisit SPARQL and SPARQL extensions debate etcEsp. as Virtuoso's SPARQL extensions are integral part of what makes the DBpedia SPARQL endpoint viable, amongst other things. The burden issue is basically veering away from the key points, which are: 1. Use the DBpedia instance properly 2. When the instance enforces restrictions, understand that this is a Virtuoso *feature* not a bug or server shortcoming. Beyond the dbpedia.org instance, there are other locations for: 1. Data Sets 2. SPARQL endpoints (like yours and a few others, where functionality mirroring isn't an expectation). Descriptor Resource vhandling ia mirrors, BitTorrents, Reverse Proxies, Cache directives, and some 303 heuristics etcAre the real issues of interest. Note: I can send wild SPARQL CONSTRUCTs, DESCRIBES, and HTTP GETs for Resource Descriptors to a zillion mirrors (maybe next year's April Fool's joke re. beauty of Linked Data crawling) and it will only make broaden the scope of my dysfunctional behavior. The behavior itself has to be handled (one or a zillion mirrors). Anyway, we will publish our guide for working with DBpedia very soon. I believe this will add immense clarity to this matter. 
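On the 'use the instance properly' point, and the earlier note about gzip over HTTP, a single resource description can be fetched with compression and content negotiation as sketched below; Berlin is just an arbitrary example resource:

    curl -L --compressed \
         -H "Accept: application/rdf+xml" \
         http://dbpedia.org/resource/Berlin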
uOn Thu, Apr 15, 2010 at 9:57 PM, Kingsley Idehen < > wrote: Having the same dataset available via different implementations of SPARQL can only be healthy. If certain extensions are necessary, this will only highlight their importance. If there are public services offering SPARQL-based access to the DBpedia datasets (or subsets) out there on the Web, it would be rather useful if we could have them linked from a single easy-to-find page, along with information about any restrictions, quirks, subsetting, or value-adding features special to that service. I suggest using a section in handle that on dbpedia.org. Yes, the showcase implementation needs to be used properly if it is going to survive the increasing attention developer LOD is getting. It is perfectly reasonable of you to make clear that when there are limits they are for everyone's benefit. Is there a list somewhere of related SPARQL endpoints? (also other Wikipedia-derived datasets in RDF) (am chatting with Daniel Koller in Skype now re the BitTorrent experiments) Sure. But on balance, more mirrors rather than fewer should benefit everyone, particularly if 'good behaviour' is documented and enforced. Great! cheers, Dan uDan Brickley wrote: +1 Yep, and as promised we will publish a document, this is certainly a missing piece of the puzzle right now. See: SPARQL endpoints, at the current time. Yes, seeing progress. Yes, LinkedData DNS remains a personal aspiration of mine, but no matter what we build, enforcement needs to be understood as a *feature* rather than a bug or deficiency etc. uHugh Glaser wrote: Spidering is what we constrain so as to preserve bandwidth. Even when you spider via SPARQL we force you down the OFFSET and LIMIT route. Key point is that these are features (self protection and preservation) as opposed to bugs or shortcomings (as these issues are sometimes framed). Complex queries, absolutely not a problem, remember, this is what the \"Anytime Query\" feature is all about, it's why we can host faceted navigation inside the Quad Store etc. Complex queries don't chew up network bandwidth. The DBpedia SPARQL endpoint is an endpoint for handling SPARQL Queries.
The Descriptor Resources that are the product of URI de-referencing are the typical targets of crawlers, at least first call before CONSTRUCT and DESCRIBE etc. We already have solutions for these resources (which include a reverse proxy setup and cache directives etc.). In addition, we may also 303 to other locations (URLs) as part of URI de-referencing fulfillment etc. But the role of dbpedia is to provide URIs and occasional URI resolution. The DBpedia instance is about providing a SPARQL endpoint and access to Descriptor Resources (nee Information Resources) via Data Object URI de-referencing; the instance can do both, and enforces what it seeks to offer. We will make a guide so that everyone is clear :-) Kingsley uKarl Dubost wrote: Karl, Yes, but that means HTTP log analysis report etc. Post guide, we might make time for something like that. There have been enough HTTP log requests over the months etc" "Pagelinks missing from the triple store at dbpedia.org?" "uHi, First of all - thanks for this whole effort that you're doing! Amazing stuff! Finally a perfect example of semantic web technologies in action! While I was playing with the endpoint at dbpedia.org/sparql, I realized that the pagelinks triples are not present. Are you going to add them? It would enable a new set of exciting queries. I was trying to load the pagelinks data into my local Sesame, but that made it a bit dizzy" "Introduction for GSoC 2017" "uHello devs, I am Richhiey Thomas and am studying CS at Mumbai University. I'd like to participate in GSoC 2017 with DBpedia. I went through the information for GSoC students and people new to DBpedia and had an easy time getting introduced to the project. After looking at the basic instructions for students, I've set up the DBpedia extraction framework with IntelliJ IDEA and am currently trying to get a hang of how the framework works by looking into the codebase and its documentation. Going through past GSoC pages and the starter pages helped me have a good start. I will try my best to start solving bugs or issues to get an idea of how things really work by the time the ideas for this year are up :) With respect to background, I had participated in GSoC 2016 with the Xapian Search Engine Library, where my project was based on 'Clustering of Search Results'. I also have a decent command over Python, C++ (and thus OOP) and had started learning Scala recently.
So this gives me a chance to look forward to the language :D I would love to know what more I can do to get involved! Thanks." "Movie reviews using DBpedia" "uOn Tue, 15 May 2007 20:23:08 +0200 \"Georgi Kobilarov\" < > wrote: Hello Georgia, I am glad to hear from you about my project! BTW, how did you heard about it? Currently, I haven't done that much because I have to study for my end of schoolar-year exams. I am now focusing on the UI. At the moment, there is only a search form, performing a research on dbpedia.org, performing the fellowing SPARQL request : [[[ SELECT ?film ?title ?director ?writer ?d ?w WHERE { ?film . ?film \" \". ?film ?title. ?film ?d. ?d ?director. ?film ?w. ?w ?writer . } ]]] So, it only works for American films. It'd be nice to see a new rdf:type indicating that it's a film, whetever it is american or not. Here is some details about my implementation : - It's written in the Ruby language, using Camping [1] as a framework and ActiveRDF [2] to access dbpedia's SPARQL end-point. - At the beginning, I am not planning to use a triple store. I'll use basic database to store informations (films and reviews) I'll try to clean up the code and publish it somewhere ASAP. Looking forward to hearing back from you :-) [1] [2] http://activerdf.org uHi Simon, Well, at times I google for \"dbpedia\", and that showed up your wiki-entry Any preview-version available yet? We uploaded such rdf:type statements last week. The class for films is See Please note that the namespace might change to dbpedia.org/class/ at some time. I'd suggest you have a look at D2R-Server [1]. It is tool to publish/serve relational databases as linked data / RDF. Nice, I'm looking forward to that. Cheers Georgi [1]" "DBpedia Spotlight v0.5 Released (Text Annotation with DBpedia)" "uHi all, We are happy to announce the release of DBpedia Spotlight v0.5 - Shedding Light on the Web of Documents. DBpedia Spotlight is a tool for annotating mentions of DBpedia entities and concepts in text, providing a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia. The DBpedia Spotlight Architecture is composed by the following modules: * Web application, a demonstration client (HTML/Javascript UI) that allows users to enter/paste text into a Web browser and visualize the resulting annotated text. * Web Service, a RESTful Web API that exposes the functionality of annotating and/or disambiguating resources in text. The service returns XML, JSON or XHTML+RDFa. * Annotation Java / Scala API, exposing the underlying logic that performs the annotation/disambiguation. * Indexing Java / Scala API, executing the data processing necessary to enable the annotation/disambiguation algorithms used. In this release we have provided many enhancements to the Web Service, installation process, as well as the spotting, candidate selection, disambiguation and annotation stages. More details on the enhancements are provided below. The new version is deployed at: * * Instructions on how to use the Web Service are available at: We invite your comments on the new version before we deploy it on our production server. We will keep it on the \"dev\" server until October 6th, when we will finally make the switch to the production server at If you are a user of DBpedia Spotlight, please join for announcements and other discussions. Changelog Changes since last public release (v0.1): * Uses DBpedia 3.7 resources, including types from DBpedia Ontology, Freebase and Schema.org. 
* New Web API method /rest/candidates provides a ranked list of candidates for each surface form. This will allow the use of DBpedia Spotlight in semi-automatic annotation (e.g. of blog posts), where users can \"fix\" a mistake made by our system by choosing another candidate from the suggestions provided by the service. * New disambiguation implementations, including a two-step disambiguator with simpler context scoring provides up to 200x faster annotation with modest accuracy loss (~7%) in our preliminary tests. * SpotSelector classes allow one to discard non-entities early in the process to improve time performance and conformance with annotation policies (e.g. do not annotate common words). * jQuery plugin for DBpedia Spotlight allows one to annotate a Web page with one line of javascript code: $('div').annotate(); * Cross Origin Resource Sharing (CORS) is now enabled by default on the Web API, allowing javascript code on your page to call our service without need for proxies. * Enhanced candidate selection stage (with approximate matching) improves coverage of candidate URIs for surface forms with small variations in spelling. * Debian packaging allows one to install DBpedia Spotlight via the package manager in many Linux distros. * Easier installation: fully mavenized process, auto-generated jars, more configuration parameters accessible via property files. * Better modularization: dependence on the DBpedia Extraction Framework was moved to module \"index\". Users that only want to run the service can now ignore that dependence. * Web API description provided via Web Application Description Language (WADL). It allows you to create clients automatically via IDEs such as Eclipse, NetBeans, etc. * Downloads: full index, compressed index, spotter dictionaries with different thresholds, etc. available from * Removed restriction on the number of characters. Beware that short texts will have lower performance since they normally provide less context for disambiguation. * Accepts POST requests in addition to GET. This allows longer text. Unless explicitly specified, long texts automatically use the Document-centric (faster) disambiguation algorithm. * A bookmarklet allows user to select text in any Web page using their good old browser and call DBpedia Spotlight directly from there in order to obtain annotated text. Acknowledgements Many thanks to the growing community of DBpedia Spotlight users for your feedback and energetic support. We would like to especially thank: * Jo Daiber for his great work on better spotters, additional types, cuter interfaces and many other improvements to the tool; * Paul Houle for the extensive feedback on the system, great suggestions for improvement and patches; * Scott White for the invaluable discussions on the architecture and other advice; * Rob DiCiuccio for his real-world use case description and PHP client implementation; * Giuseppe Rizzo for his friendly push for releasing the /candidates API and feedback on the API's design; * Thomas Steiner and Rainer Simon for opening the Known Uses list ( Cano et al., Ali Khalili, Raphaël Troncy and Giuseppe Rizzo for letting us know of their uses of DBpedia Spotlight. With this release we also have the pleasure of welcoming Jo Daiber as a committer. We are looking forward to continuing this fruitful collaboration. 
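A quick way to try the Web Service described in this announcement from the command line is sketched below; the parameter names follow the project's REST documentation of the time, but the exact values (and whether the /dev/ or production host is the one to use) should be treated as assumptions:

    curl -s http://spotlight.dbpedia.org/rest/annotate \
         -H "Accept: application/json" \
         --data-urlencode "text=Berlin is the capital of Germany." \
         --data "confidence=0.2" \
         --data "support=20"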
This release of DBpedia Spotlight was supported by The European Commission through the project LOD2 – Creating Knowledge out of Linked Data (http://lod2.eu/). DBpedia Spotlight's source code is provided under the terms of the Apache License, Version 2.0. Part of the code uses LingPipe under the Royalty Free License. The source code can be downloaded from: http://sourceforge.net/projects/dbp-spotlight A paper describing DBpedia Spotlight was published at I-SEMANTICS 2011: Pablo N. Mendes, Max Jakob, Andrés García-Silva and Christian Bizer. DBpedia Spotlight: Shedding Light on the Web of Documents. In the Proceedings of the 7th International Conference on Semantic Systems (I-Semantics). Graz, Austria, 7–9 September 2011. Happy annotating!
Cheers, Pablo, Max, Jo, Chris" "DBpedia 3.4 released" "uAzamatAbdoullaev wrote: I'd be interested in hearing your proposal to fix this \"mess\". Apparently, you have a different notion of the term \"ontology\" - which is not surprising, given your definition which probably resembles \"the ontology is a conceptualization of the world\" has been the prevalent one for a long time. These days, a more appropriate definition seems to be \"An ontology is the formalization of a conceptualization\". The DBpedia ontology definitely fits this definition. I found a blog which has some great explanations [1]. Things do not in cycles at all. :) Regards, Michael [1] some-great-w3c-explanations-of.html" "Extracting audio files from Wikipedia (was: Multi-listen_item)" "uYves, At the time of the previous extraction, there were some audio samples embedded in J.S. Bach's article page. They have since moved to their own separate page: We don't currently have an extractor for audio files. This would require a new extractor, which could be patterned after the ImageExtractor, since audio file embedding in Wikipedia uses the syntax as image embedding. I have done some preliminary investigation here: If you want to work with the audio files without waiting for a new extractor: This link should have all the information you need to reconstruct the audio file's URL. Best, Richard On 31 Jan 2008, at 12:02, Yves Raimond wrote: uOn 05/02/2008, Richard Cyganiak < > wrote: Hi, I'm new to this list. I have some significant experience with Wikipedia ( & MediaWiki) now so I hope I can help. There is a much simpler way to find the file URL than figuring out the file hash yourself. :) As you noticed audio files are linked the same way as images, something like [[Image:Johann Sebastian Bach - The Well-tempered Clavier - Book 1 - 02Epre cmaj.ogg]] links to the image page. [[Media:Johann Sebastian Bach - The Well-tempered Clavier - Book 1 - 02Epre cmaj.ogg]] links directly to the file, but since the new browser audio/video player, probably that is less common. However you can get the path directly like this: < > ie replace \"Image:\" with \"Special:Filepath/\". I think this is what you are trying to do? cheers, Brianna uBrianna, On 5 Feb 2008, at 08:24, Brianna Laugher wrote: Welcome! That's indeed much simpler. Thanks! :-) This seems to work with images too. Can it also be used to link to scaled-down thumbnails? Richard uOn 05/02/2008, Richard Cyganiak < > wrote: Nohm. Let me do a bit of investigation on that and get back to you. :) You could do this: This generates a 200px-width thumbnail. However I believe it bypasses the cache, so it's not appropriate to use this to hotlink a thumbnail. If you did it once per image to make the thumbnail and then save it locally I think that would be OK. cheers Brianna uOn 06/02/2008, Brianna Laugher < > wrote: Oh and I meant to sayyou may need to try doing that on en.wikipedia.org first and it that doesn't work, then commons.wikimedia.org . I don't know that there's an easy way to check if an image has been \"really\" uploaded to en.wp or Commons, because images on Commons work transparently *as if* they were on en.wp. cheers Brianna uOn 6 Feb 2008, at 03:23, Brianna Laugher wrote: Ok, we don't want to save the images locally, so I suppose it makes sense to keep our current behaviour for images (check in the en.wp database first; if it's not there assume it's in commons; construct the cached thumbnail URI by doing the md5 hash magic). Thanks a lot, Brianna! 
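The Special:FilePath trick Brianna describes can be exercised directly with curl, as below; the file name is the Bach example from this thread (it may since have moved), and -L follows the redirect to the upload server:

    curl -L -o bach_prelude.ogg \
      "https://en.wikipedia.org/wiki/Special:FilePath/Johann_Sebastian_Bach_-_The_Well-tempered_Clavier_-_Book_1_-_02Epre_cmaj.ogg"
    # if the file is not on en.wikipedia.org, try commons.wikimedia.org instead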
Richard uOn 07/02/2008, Richard Cyganiak < > wrote: OK, you can do these things using the API: gives you the URL of the thumbnail. shows you imagerepository=\"shared\" or \"local\" so that tells you if the image is at en.wikipedia or commons. So yay for the API! cheers, Brianna" "semantic in URIs, was: dbpedia-links: Recommendation for predicate "rdrel:manifestationOfWork" ?" "uLe 2013-05-07 13:42, Pascal Christoph a écrit : I probably put semantic everywhere that I can, it's not like it was something I do on purpose. In my humble opinion, that's natural human tendancy. I was actually just trying to gather information on this particular project architecture choices, not on URI specifications. So while I really appreciate that you took time to answer me, I didn't find what I was seeking in links you provided." "Why is the OWL ontology in RDF/XML?" "uA minor inconsistency I've noticed in dbpedia is that the OWL ontology is represented in RDF/XML, while the rest of dbpedia is in NT. I like NT. I've got a special-purpose NT parser that works very well with dbpedia. (I found that many commercial & OS RDF tools can't handle the dbpedia dumps; at some point I realized I have to parse the files in front of me rather than support a \"standard\") Any chance we could get the OWL ontology in NT as well? uPaul Houle wrote: Paul, Why not convert the RDF/XML to N3? uHello, Paul Houle schrieb: It can be converted of course: Kind regards, Jens uJens Lehmann wrote: I ran it through a converter last night and got a document that, like yours, contained blank nodes. These are implicit in the RDF-XML, but need to be named in order to be serialized as NT. That's one substantial difference between the OWL ontology and the rest of dbpedia. uHello, Paul Houle schrieb: That's true. If you do not like the blank nodes, you could perform a string replace, e.g. \"_:genid\" replaced by \" In general, an OWL axiom may require several RDF triples to represent it. Usually, this is done by introducing blank nodes. Kind regards, Jens" "DBpedia and Wikidata together" "uKingsley, I am very much heartened by your response because I thought you're a die-hard DBpedia guy; and a couple months ago when Markus pointed to the excellent Resumator, you sort of said \"but does it LOD\". Wikidatians have strength in numbers. As Multichill (Sum of All Paintings, WikiCulture, WikiLovesMonuments) is just emailing me \"We have bots and we're not afraid to use them. My bot (BotMultichill) has over 4 million edits\" They're not afraid to screw some modeling up, because they're gonna fix it. They talk to each other, all the time, in sophisticated ways. (Well, not all of them are sophisticated: e.g. I am just learning) Mediawiki is the most sophisticated consensus-building platform, right? They're good with practical information architecture (things shold be easy to use and see), without being afraid of breaking some puny rule. They're not afraid to use what used to be a cursed languge (PHP). Phabricator is a great10 tools in one? I only used it couple days but the UI is excellent. And they are getting serious about RDF: they're retooling the WDQ backend because that wonderful beast is ever hungry. (They picked BigData because of free support and strong engagement). I don't quite clearly see the Complementarity between DBP and WD that you see, but there are options. What I see that there's got to be a joint series of workshops. Maybe also with the SemanticMediaWiki (or OntoWiki) people, because they have the third perspective. Who's for it? 
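Returning to the RDF/XML-versus-N-Triples exchange a few messages above: the conversion Jens mentions can be done, for example, with Raptor's rapper; the tool choice and file name here are assumptions, and blank nodes will come out with generated _:genid-style labels, which is exactly the wrinkle Paul and Jens discuss:

    rapper -i rdfxml -o ntriples dbpedia_3.4.owl > dbpedia_3.4.nt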
I sure would be happy: will cut down on travel :-) PS: Too bad this thread won't go to wikidata-l because I'm not subscribed there. uOn 3/30/15 7:01 PM, Vladimir Alexiev wrote: That's an example of how my comments can be easily misconstrued, due to my DBpedia and LOD proximity :) All I want is for identifiers to resolve to description documents. Net effect, we have URIs functioning as terms rather than words and/or phrases, which expands the power of the Web. All good. \"A\" rather than \"The\" :) Yes! Not being afraid to break things is crucial. A system of draconian rules is DOA. That said, Linked Open Data isn't draconian, but I do empathize with those who might initially arrive at that conclusion. And that's the weak point that needs fixing. Wikidata is about a crowd-oriented DataWiki for structured data. DBpedia is about Wikipedia content rendition in 5-Star Linked Open Data form. Wikidata enables DBpedia to be better. DBpedia enables Wikidata to be better, even if DBpedia shortcomings are the initial point of focus. Yes, but not just workshops. Principals behind both projects need to talk and get to understand one another, genuinely. As I said, these projects are inherently complementary+++ Both are DataWiki variants. They are complementary tools that can play a role. Just like related tools that offer read-write capability at Web-scale via open standards such as LDP, the SPARQL Graph Protocol, SPARQL 1.1 Update etc. It's in the cc. :) Kingsley" "Some images not found" "uI am using the image dataset from DBpedia (3.7), and I found that some image URLs do not exist. For example: How to find the good image for a resource? How to detect when a URL does not exist anymore? Thanks uHi Didier, On 06/08/2012 07:06 PM, didier rano wrote: Actually, there was a bug in the ImageExtractor, which is now fixed. So, this issue will vanish in the next DBpedia version. You can also use DBpedia-Live, available at as well. Hope that helps. uAm 08.06.2012 19:06 schrieb didier rano: You may use a link checker (e.g. [1]) to verify if the images exist. As I experienced the same (often 404 for image URLs) I would propose to use such a link checker in the process of _generating dbpedia_ in the first place! oo [1]" "wiktionary.dbpedia.org online - Linked Data, SPARQL and Dumps" "uHi lists, we are proud to announce that we now host the data we extract from wiktionary publicly on wiktionary.dbpedia.org. We offer Linked Data: a SPARQL endpoint: and N-Triple Dumps: There is also a wiki explaining some details: We have currently extracted data from the English and German Wiktionary (28M triples and 3.7M triples), but plan to extend that to at least the biggest 5 wiktionaries within the next weeks, as our approach focuses on extensibility. The data for each word is structured hierarchically (as wiktionary is) and contains information about language, part of speech, definitions, translations, synonyms, hyperonyms and hyponyms etc. There might be some quality issues, but we want to release early, so bear with us and report major problems.
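Since the discussion that follows notes that the Wiktionary vocabulary is still provisional, a schema-agnostic way to explore the new endpoint (assumed to be at wiktionary.dbpedia.org/sparql) is simply to list the predicates actually in use:

    SELECT ?p (COUNT(*) AS ?uses)
    WHERE { ?s ?p ?o }
    GROUP BY ?p
    ORDER BY DESC(?uses)
    LIMIT 50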
Thanks goes to the wiktionary community which does a great job creating this dataset, and we hope to enable new use cases and consequently promote the contribution to the wiktionary project. Regards, Jonas Brekle Department of Computer Science, University of Leipzig Research Group: http://aksw.org uHi Jonas Great resource! I'm curious, though, about the vocabulary (predicates) used, such as The above URIs are not dereferencable, at least not to a usable description, formal or not, neither is the namespace Will this vocabulary be published at some point? And did you consider reusing existing predicates from existing vocabularies? such as or other listed at Best regards Bernard Le 13 mars 2012 13:22, Jonas Brekle < > a écrit : uOn 13 March 2012 12:22, Jonas Brekle < > wrote: CCs trimmed to the list I'm on That URL redirects to: which makes no mention of Wiktionary. uGreat idea! What about extracting also the cateogories via dcterms:subject/skosk:broader properties as for the \"regular\" DBpedia? regards, roberto Il 13/03/2012 13:22, Jonas Brekle ha scritto: uAm Dienstag, den 13.03.2012, 15:05 +0000 schrieb Andy Mabbett: yes we got no project website yet. just the data. i will make one soon. uAm Dienstag, den 13.03.2012, 14:48 +0100 schrieb Bernard Vatant: yes, we need to fix this soon. to be honest this is just a dummy vocabulary until we decide what to reuse. yes, some of these. also the schema might change: the data is very hierarchical and we might (additionally?) transform it to \"word\" -> \"senses\". uHi Jonas Great if this is in your roadmap! I was afraid this was about to be yet-another-great-dataset-without-retrievable-vocabulary :) Please ping LOV when it's done ;-) Bernard Le 13 mars 2012 18:57, Jonas Brekle < > a écrit :" "DBPedia Import" "uHello everybody, I'm new with DBpedia and need some basic advises. I want to import the dbpedia in Protege and to query it. I'm getting just the classes and I don't know how to get the datasets. It's a very basic question, I know, but could you help me with this please! Thank you! My very best! M.Chukanska Hello everybody, I'm new with DBpedia and need some basic advises. I want to import the dbpedia in Protege and to query it. I'm getting just the classes and I don't know how to get the datasets. It's a very basic question, I know, but could you help me with this please! Thank you! My very best! M.Chukanska uHi Mariana, This is how I started: all the best, Parisa From: Mariana Chukanska [mailto: ] Sent: donderdag 17 juli 2014 11:45 To: Subject: [Dbpedia-discussion] DBPedia Import Hello everybody, I'm new with DBpedia and need some basic advises. I want to import the dbpedia in Protege and to query it. I'm getting just the classes and I don't know how to get the datasets. It's a very basic question, I know, but could you help me with this please! Thank you! My very best! M.Chukanska" "SNORQL and SPARQL endpoints returning different results" "uIn the DBpedia SPARQL endpoint [1], running PREFIX dc: PREFIX : PREFIX dbpedia2: SELECT ?a (3+3 AS ?y) WHERE { ?a dc:description \"English footballer\" . ?a dbpedia2:placeOfBirth :Merseyside . 
} Shows all English Footballers who were born in Merseyside, with column y just displaying the value 6 on every row [result link ]; however, the same query on the SNORQL endpoint displays an error: Virtuoso 37000 Error SP030: SPARQL compiler, line 16: syntax error at '3' before 'AS' SPARQL query: define sql:big-data-const 0 #output- format:application/sparql-results+json define input:default-graph-uri PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: PREFIX pos: PREFIX dbo: SELECT ?a (3 3 AS ?y) WHERE { ?a dc:description \"English footballer\" . ?a dbpedia2:placeOfBirth :Merseyside . } Even more strangely, using any of the other 3 arithmetic operators *does* work in the SNORQL endpoint (e.g. with division [5]) A previous question [6] on Stackoverflow has implied that the SPARQL and SNORQL endpoints should return the same result, so what's going on here?! Cheers, Chris 1: 2: 3: 4: 5: 6: PREFIX : < PREFIX dbpedia2: < SELECT ?a (3+3 AS ?y) WHERE { ?a dc:description 'English footballer' . ?a dbpedia2:placeOfBirth :Merseyside . } Shows all English Footballers  who were born in Merseyside, with column y  just displaying the value 6  on every row [ result link ]; however, the same query  on the SNORQL endpoint  displays an error: Virtuoso 37000 Error SP030: SPARQL compiler, line 16: syntax error at '3' before 'AS' SPARQL query: define sql:big-data-const 0 #output- format:application/sparql-results+json define input:default-graph-uri PREFIX owl: PREFIX xsd: PREFIX rdfs: PREFIX rdf: PREFIX foaf: PREFIX dc: PREFIX : PREFIX dbpedia2: PREFIX dbpedia: PREFIX skos: PREFIX pos: PREFIX dbo: SELECT ?a (3 3 AS ?y) WHERE { ?a dc:description 'English footballer' . ?a dbpedia2:placeOfBirth :Merseyside . } Even more strangely, using any of the other 3 arithmetic operators does  work in the SNORQL endpoint (e.g. with division  [5]) A previous question  [6]  on Stackoverflow has implied that the SPARQL and SNORQL endpoints should return the same result, so what's going on here?! 
Cheers, Chris 1: http://dbpedia.org/sparql/ 2: http://dbpedia.org/sparql/?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=++++PREFIX+dc%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0D%0A++++PREFIX+%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E%0D%0A++++PREFIX+dbpedia2%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2F%3E%0D%0A%0D%0A++++SELECT+%3Fa+%283%2B3+AS+%3Fy%29%0D%0A++++WHERE+%0D%0A++++%7B+%0D%0A+++++++%3Fa+dc%3Adescription+%22English+footballer%22+.%0D%0A+++++++%3Fa+dbpedia2%3AplaceOfBirth+%3AMerseyside+.%0D%0A++++%7D&format=text%2Fhtml&timeout=30000&debug=on 3: http://dbpedia.org/snorql/?query=PREFIX+pos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F01%2Fgeo%2Fwgs84_pos%23%3E%0D%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0D%0A%0D%0ASELECT+%3Fa+%283%2B3+AS+%3Fy%29%0D%0AWHERE+%0D%0A%7B+%0D%0A+++++%3Fa+dc%3Adescription+%22English+footballer%22+.%0D%0A+++++%3Fa+dbpedia2%3AplaceOfBirth+%3AMerseyside+.%0D%0A%7D 4: http://dbpedia.org/snorql/ 5: http://dbpedia.org/snorql/?query=PREFIX+pos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F01%2Fgeo%2Fwgs84_pos%23%3E%0D%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0D%0A%0D%0ASELECT+%3Fa+%283%2F3+AS+%3Fy%29%0D%0AWHERE+%0D%0A%7B+%0D%0A+++++%3Fa+dc%3Adescription+%22English+footballer%22+.%0D%0A+++++%3Fa+dbpedia2%3AplaceOfBirth+%3AMerseyside+.%0D%0A%7D 6: http://stackoverflow.com/a/15658884/889604 uJust a sidenote - using pastebin.com or whatever other paste tool would make the question 10 times more readable. Cheers, Aleksander uIn the interest of avoiding any repeated effort, note that this was also asked on Stack Overflow as: On Thu, Feb 19, 2015 at 11:44 AM, Chris Wood < > wrote: u0€ *†H†÷  €0€1 0 + u0€ *†H†÷  €0€1 0 + uHi Chris, The snorql code was using the wrong method to encode the query string before calling the real /sparql interface. It was using the very old 'escape()' javascript function which does not encode the + character as a special character. It then uses that escaped string to call the real /sparql endpoint. When the /sparql endpoint decodes the &query;=XXXX3+3YYYYY string it decodes the + character as a space thereby creating a syntax error in your query. I fixed the snorql code on http://dbpedia.org to use the encodeURIComponent function which is the proper way of embedding a query as a parameter argument in a URL. Patrick uOn 2/19/15 4:49 PM, Patrick van Kleef wrote: Fix confirmed: http://dbpedia.org/snorql/?query=PREFIX+pos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F01%2Fgeo%2Fwgs84_pos%23%3E%0D%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0D%0A%0D%0ASELECT+%3Fa+%283%2B3+AS+%3Fy%29%0D%0AWHERE+%0D%0A%7B+%0D%0A+++++%3Fa+dc%3Adescription+%22English+footballer%22+.%0D%0A+++++%3Fa+dbpedia2%3AplaceOfBirth+%3AMerseyside+.%0D%0A%7D Hopefully, someone could help closing out the case on Stackoverflow [1] ? [1] http://linkeddata.uriburner.com/about/id/entity/http/stackoverflow.com/questions/28611127/dbpedias-sparql-and-snorql-returning-different-results . uHi Patrick That's awesome - thanks for sorting it so quickly! 
(And apologies for not using pastebin originally, I hadn't realised my code would get so mangled in some mail clients) Cheers, Chris On 19 February 2015 at 21:49, Patrick van Kleef < > wrote:" "Dbpedia growth trends" "uHi everybody: I am a master student at Saarland University (Germany) who is working with semantic databases (specifically efficient partitioning) and I was wondering if there is available information about the growth of the dbpedia datasets in order to somehow justify my work. The webpage says that there are 3.500.000 resources by Jan 2010 but it would be great if I can show the growth trend. I suspect there is a positive correlation with the growth of wikipedia articles but I think it would be better if I show directly the amount of semantic information. Any kind of help will be kindly appreciated. Thanks in advance for your attention and help and keep the good job! All the best, Luis Galárraga Hi everybody: I am a master student at Saarland University (Germany) who is working with semantic databases (specifically efficient partitioning) and I was wondering if there is available information about the growth of the dbpedia datasets in order to somehow justify my work. The webpage says that there are 3.500.000 resources by Jan 2010 but it would be great if I can show the growth trend. I suspect there is a positive correlation with the growth of wikipedia articles but I think it would be better if I show directly the amount of semantic information. Any kind of help will be kindly appreciated. Thanks in advance for your attention and help and keep the good job! All the best, Luis Galárraga uHello, Am 29.06.2011 15:23, schrieb Luis Galárraga: We kept all previous releases at this would be the easiest way to show the growth of DBpedia. Please let us know your results. Here is the size of the folders (which is not the most accurate measure because there are several reasons why the size can change apart from more extracted information): 1.8G ./1.0 2.5G ./2.0 7.6G ./3.0rc 5.1G ./3.0 6.0G ./3.1 6.4G ./3.2 7.3G ./3.3 21G ./3.4 32G ./3.5 35G ./3.5.1 34G ./3.6 Kind regards, Jens u‰PNG  u2011/6/30 Luis Galárraga < >: Didn't you connect those data points with Bezier curves? It makes the two Q1 2010 points (which are almost? directly over each other) look like they're going back in time. Straight lines would probably be more appropriate Tom u‰PNG " "Inconsistent results from sparql queries (David Spacey)" "uWe (in the BBC Search team) are still seeing the problem below. To recap, Dave wrote: For example using a simple SPARQL query like SELECT ?abstract WHERE { ?abstract . FILTER ( langMatches( lang(?abstract), 'en') || ! langMatches(lang(?abstract),'*') ) } on dbepdia.org/sparql gives the abstract that you¹d expect; running it on dbpedia-live.openlinksw.com/sparql doesn¹t find it. However if you use abstract_live rather than abstract then it does find something ­ although surprisingly it finds two different abstracts! Just to be explicit, running this query SELECT ?abstract WHERE { ?abstract . FILTER ( langMatches( lang(?abstract), 'en') || ! langMatches(lang(?abstract),'*') ) } on dbpedia-live.openlinksw.com/sparql does give a result, in fact it gives two! So really I have three main questions (although any information would be useful): 1. What is the 2. Cam we safely use it instead of 3. Why does it have more than one value? We are seeing this on around 20% of the resources that we are interested in, and is preventing us from using the Live DBPedia at present. Thanks very much, Stephn. 
On 15/1/10 12:06, \" \" < > wrote: uHello, The issue is hard to explain, as there are still quite some bugs on dbpedia-live, which we only found after letting it run for a while. We optimized the speed and will soon reload DBpedia 3.4 and also load all changes since September, which will fix the missing or double abstracts. Here is how it should work: Every page has static abstracts (the ones you know). These abstracts are the same as as in 3.4 and will remain static. abstract_live is the abstract extracted for the last revision. The main reason why there are two is, that there are two different responsible extractors: - A slow one, which produces abstracts with better quality and produced the abstracts for 3.4. - A fast one (factor 10-100) , which produces abstracts with more or less acceptable quality and produces comment_live and abstract_live Once we improve the speed of the \"better\" AbstractExtractor, they will be merged again, but this could take quite some time as it is a complicated and expensive process( It involves parsing Wiki syntax, extending a MediaWiki and synchronizing template definitions) Perhaps, it will even stay like that and we will only refresh the static abstract information with each available Wikipedia dump. So there would always be a live english (for now) version and a static one with better quality and for all languages. I think, it might not be the worst solution. Regards, Sebastian Stephen Betts schrieb: uHi Sebastian, Thank you very much for the detailed response ­ that makes a lot of sense. So it sounds like we could probably use the abstract_live if there is not abstract in the DBpedia record, and use the proper abstract when it becomes available. Do you think the bugs that you mention should stop us using DBpedia Live at present, or are they things that we can work around at present (as I suggest above) and they will improve in future? Basically I¹m asking whether the bugs are serious enough for you to advise holding off on using DBpedia Live at present. Of course, if we should hold off for the minute, some idea of when those bugs would be resolved sufficiently for us to use DBpedia Live would be really helpful, but I know that may be difficult to say at present. Thanks again. Yours, Stephen. On 2/3/10 18:19, \"Sebastian Hellmann\" < > wrote: uStephen Betts schrieb: Yes, it is quite difficult to say. I will get back to you next week, as I can better answer your question then. Basically, if we fix the issue that some things are not deleted properly resulting in e.g. double abstracts, we would have a more or less stable version. Then the current framework uses a basic mapping for the dbpedia.org/ontology namespace, which also has some bugs. We are testing both currently and if it is resolved, there will be a stable version. As I said, next week I can give you an estimate. Regards, Sebastian" "Name Clash - Error on mvn install of DBpedia Extraction Framework" "uHello dbpedia-community! I'm having troubles with mvn install of the DEF and I hope that someone here can help me. If I don't comment out the Live Module from extraction_framework's POM-File mvn install won't complete successfully. That's what it complains about: [INFO] uHi David , On 02/07/2012 07:28 PM, David Gösenbauer wrote:" "Link Dbpedia and freebase relations" "uHi, I'd like to find out which dbpedia ontology relations (e.g., birthDate) correspond to which freebase relations. Has this mapping been established, or would it have to be done manually for relations of interest? 
Thanks, Nicholas uHi Nicholas, there are currently only instance-level links between Freebase and DBpedia and no mapping on schema-level. Having such a mapping would be great. So, if you work into this direction, please send us your results, so that we can publish them together with Dbpedia. Kind regards, Chris uHi Nicholas, There isn't currently such a mapping, but I suspect that it would be pretty easy to generate it. Most DBpedia topics can be mapped to Freebase topics, and between any two connected topics in both datasets, there usually is only a single Freebase relation and DBpedia relation. It is straightforward to infer that these relations are the same, especially if the same relations co-occur many times across many topics. -Colin On Jun 10, 2009, at 10:45 PM, Nicholas O. Andrews wrote: uColin Evans wrote: Colin, What's the freebase ontology? Do this exist in RDFS or OWL form? Kingsley uI don't think so. They have their own type system. On Thu, Jun 11, 2009 at 2:37 PM, Kingsley Idehen< > wrote: uNicholas O. Andrews wrote: Yes, my response was part loaded, and very much \"tongue in cheek\" :-) Hopefully, the magnitude of the task is somewhat clearer. Kingsley uI'm only interested in a small number of relations, so it's not too bad. That said, I agree with Colin that it should be straightforward to infer the relations given linked entities. On Thu, Jun 11, 2009 at 2:46 PM, Kingsley Idehen< > wrote: uKingsley - You can obtain an RDFS view of all the Freebase Types in a Domain by using the Acre Application at To see all of the Types in the Film domain you can use the URI: Since the application is written in Acre, anyone can clone the application and customize it as needed. I've been using the service to build apps that make use of Freebase vocabularies in RDFa and it seems to work well, but I'd be interested in seeing what other folks in the community might do with it. The application also provides experimental support for generating RDFS from the vocab.freebase.com Base. So for instance to see the Google Data-Vocabulary exported from the Freebase model of the vocabulary you can use the URI: If people are interested in using Freebase to model vocabularies I'll be working on these services at the Yahoo! vocamp next week. We will probably also talk these types of services at Code Camp on Sunday and during our SemTech Tutorial Monday morning. J On Jun 11, 2009, at 11:37 AM, Kingsley Idehen wrote: uYou beat me to that question. Is there an rdfs/owl version of the freebase ontology? Juan Sequeda www.juansequeda.com On Jun 11, 2009, at 2:37 PM, Kingsley Idehen < > wrote:" "Fact Extraction from Wikipedia Text datasets released" "u[Begging pardon if you read this multiple times] The Italian DBpedia chapter, on behalf of the whole DBpedia Association, is thrilled to announce the release of new datasets extracted from Wikipedia text. This is the outcome of an outstanding Google Summer of Code 2015 project, which implements NLP techniques to acquire structured facts from a textual corpus. The approach has been tested on the soccer use case, with the Italian Wikipedia as input. The datasets are publicly available at: and loaded into the SPARQL endpoint at: You can check out this article for more details: If you feel adventurous, you can fork the codebase at: Get in touch with Marco at for everything else. Best regards, Marco Fossati uOn 9/2/15 2:33 PM, Marco Fossati wrote: Awesome !" "Pagelinks missing from the triple store atdbpedia.org?" 
"uHi Jiri, Thank you :) We did not load the pagelinks in our triple store, because it would flood our linked data browsers. The resource i.e. 1000 triples. Kingsley et al: Is it possible to load the triples in our store but only get them when they are explicitly queried? So to exclude them from \"Select *\" queries? May I ask what you have in mind? I could prepare a dump for you with pagelink-counts, i.e. the count of incoming or outgoing pagelinks per resource. Cheers, Georgi uThe reason I was asking is that the structured wiki data (categories & infoboxes) contains usefull information, but not all. I was thinking about \"show me all cars that were in James Bond movies\" sort of queries. The relationship of a car and movie is not in an infobox, but certainly is in the text (a link); as in here: OTH, I understand the flooding problem uHi Jiri, you are right about your suggestion to solve the flooding problem on the client side, but at present we only have generic linked data browsers in place that can't do such filtering. Another reason for not loading into our triplestore was that these pagelinks are quiet poor in semantics, because they only contain the information that two resources are \"somehow\" related, but not \"how\". But the solution might be modeling the pagelinks in a slightly different way: creating a subnode like \"pagelink_collection\" that contains all pagelinks. That way our browsers are not flooded and others can query for the pagelinks. Richard: What do you think? Jiri, I'll get back to you on that tomorrow. Cheers Georgi" "Infobox_properties and Influences" "uHi everybody, thanks again for helping me out with my last question! My master thesis is making great progress. While working with the mappingbased- and infobox_properties files I've encountered a new issue I was wondering about. Some of the entries I find in the property files don't really seem to match with the actual wikipedia articles. For example I found that according to the infobox_properties file writer Alexandru_Macedonski influenced many other writers, but I can't find any of this information on the article on Macedonski in the english wikipedia, neither in the infobox nor in the text itself. Does anyone know why? thanks again, Christoph uHi Christoph, this is interesting. The information does not show on the web page, but if you look at the Wiki markup source code of the corresponding Wikipedia page, the information is contained in the infobox: The DBpedia extraction code uses that source code (and not the rendered HTML), so it gets access to that information. For some reason, it is not rendered to HTML, but that seems to be a Wikipedia issue, not a DBpedia one. Hth. Cheers, Heiko Am 22.01.2015 um 10:47 schrieb Christoph Hube: uHi Christoph, Actually, if you go to the revision from which data for Alexandru Macedonski was extracted for DBpedia 2014 and check the source of the page [1], you find all the names in the two long lists in the infobox - \"influenced\" and \"influences\". Though I have no idea why these names are not visible on the page, and why they were deleted in the current version. Cheers, Volha [1] On 1/22/2015 10:47 AM, Christoph Hube wrote:" "Multiple template mappings" "uI'm trying to produce a template mapping to extract Football Player information from team pages and season pages. 
I have a start to this at: The result I'm going for is (in Turtle) dbpedia-owl:organisationMember [ dbpedia-owl:currentMember ; dbpedia-owl:squadNumber 9 ; dbpedia-owl:position \"FW\" ], [ dbpedia-owl:currentMember ; dbpedia-owl:squadNumber 17 ; dbpedia-owl:position \"MF\" ]. However, the TemplateMapping information states: The first template infobox on a page defines the type of this page, while further infobox templates will be extracted as instances of the corresponding types and own URIs. The template Infobox Automobile shall be mapped to the ontology class Automobile, while the template Infobox Automobile engine shall be mapped to the ontology class AutomobileEngine. The correspondence of the infobox template instances can be preserved by defining a correspondingClass and correspondingProperty on the Automobile instance pointing to the AutomobileEngine instance. Which means that I can't just use a TemplateMapping that contains an IntermediateNodeMapping (which was what I was trying to do, until I read the documentation). I can use TemplateMapping's correspondingClass/Property, but that inverts everything: [ dbpedia-owl:currentMember ; dbpedia-owl:squadNumber 9 ; dbpedia-owl:position \"FW\" ; dbpedia-owl:team ], [ dbpedia-owl:currentMember ; dbpedia-owl:squadNumber 17 ; dbpedia-owl:position \"MF\" ; dbpedia-owl:team ]. To me, the more correct way of doing this is that a SoccerClub has players, which is why I would prefer the first example. Is there any way to achieve this using the current mapping templates? Or is there any way to tell the parsing engine that I'd like to parse a secondary template as a primary template? Bryan" "DBPedia IRI bug?" "uHello! I am trying to get some RDF/XML out of that URI: However, curl -L -H \"Accept: application/rdf+xml\" document And same thing for How can I get the RDF/XML for that resource? Cheers, y uYves Raimond wrote: try to double encode the last part of the URI: curl -L -H \"Accept: application/rdf+xml\" uThanks Davide :-) However, that worried me a little: you have to escape the \"%\"? So any generic client wanting to ingest things from DBpedia needs to have custom logic to handle that double-encoding? Cheers, y uYves Raimond wrote: Yes, that's my fear. In fact, some months ago I was doing a massive ingestion from dbpedia and my business logic didn't provide this kind of custom logic. The result was that a lot of dbpedia resources were not ingested. Cheers uWhy is double encoding necessary? This sounds like a bug with the server. Cheers, Peter 2009/10/1 Davide Palmisano < >: uLooking at the headers that are returned it definitely looks like a server bug: uDavide Palmisano wrote: That's why, when I do generic database work (either dbpedia or freebase) I like to download the whole thing. If you're doing a little bit at a time, integrity problems have a way of going unnoticed. For instance, when I loaded all of dbpedia into a non-RDF system that was case insensitive for titles, it was clear that wikipedia has 10,000 or so entry pairs like this: here you've got two similar but different items that differ in title only by case. In general you want a human-friendly search engine or search completion facility to be case insensitive, \"alan alda\" should turn up but you also need something that can correctly handle labels/urls that vary only by case without breaking.
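[Editorial note: a hedged sketch of how the case-only title collisions described above could be listed with SPARQL. LCASE and STR are SPARQL 1.1 built-ins, and the label self-join makes the query far too heavy for the public endpoint, so it is meant for a local DBpedia load.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Pairs of resources whose English labels differ only by letter case.
SELECT ?a ?b ?labelA ?labelB
WHERE {
  ?a rdfs:label ?labelA .
  ?b rdfs:label ?labelB .
  FILTER (LANG(?labelA) = "en" && LANG(?labelB) = "en")
  FILTER (STR(?a) < STR(?b))                          # report each pair once
  FILTER (?labelA != ?labelB)                         # not byte-identical ...
  FILTER (LCASE(STR(?labelA)) = LCASE(STR(?labelB)))  # ... but equal ignoring case
}
LIMIT 100
]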
Had I used a standard RDF stack for my work with dbpedia, I would have blasted right by many integrity issues that the system I built forced me to confront." ":BaseKB EA 2 and basekb-tools now available" "uWe're proud to announce the immediate availability of :BaseKB Early Access 2 The EA2 release fixes a number of problems in EA1, most particularly problems that affected name resolution. EA2 also comes with the initial release of basekb-tools: an open-source package that implements the name resolution behavior of the proprietary MQL query engine for SPARQL. Put simply, basekb-tools takes a readable query like select ?date { graph public:baseKB { basekb:en.united_states basekb:location.dated_location.date_founded ?date . } } and rewrites it with identifiers that satisfy the unique name assumption select ?date { graph public:baseKB { basekb:m.09c7w0 basekb:m.035qyst ?date . } } This \"grounding\" process is the key innovation that makes :BaseKB the first correct RDFization of Freebase. :BaseKB covers the intersection of Freebase and Wikipedia and, like the Freebase quad dump, is available freely under a CC-BY license. By installing :BaseKB in a triple store and using basekb-tools, you can write queries against Freebase using the industry-standard SPARQL 1.1 language. basekb-tools contains a test suite that confirms the correctness of the knowledge base and compatibility with your triple store. We believe that this kind of test suite is a practical answer to pressing problems of Proof and Trust in the semantic web. \":BaseKB is an important milestone for both Freebase and the Semantic Web,\" says Ontology2 founder Paul Houle, \":BaseKB opens Freebase to users of SPARQL and other RDF standards. The superior quality of Freebase data solves data quality problems that have, so far, frustrated Linked Data applications.\" Ontology2 founder Paul Houle will be speaking at the SemTechBiz conference in San Francisco on June 7. His talk, \"Linked Data, Free Pictures, and Markets for Semantic Data\" will cover Ookaboo, :BaseKB and how they fit into the rapidly emerging semantic-social ecosystem.
sessionPop.cfm?confid=65&proposalid=4594 uHi Paul and colleagues, good news. I'm happy to experience your system. Keep in touch. cristian On 5/29/12 8:53 PM, Paul A. Houle wrote:" "Class syntax: underscores vs. camel case" "uOn 23 Jun 2007, at 13:35, Rich Knopman wrote: Don't generate display labels from URIs. As Tom says, that's what rdfs:label is for. The RDF descriptions that can be retrieved by dereferencing the class URIs will include proper rdfs:labels generated from the WordNet IDs by substituting underscores with spaces, and whatever else is necessary. Richard uHi Rich, On 24/06/07, Rich Knopman < > wrote: No forgiveness required :) I think we're all feeling our way with this and still learning; I know I am for sure! Hopefully as a list we can fill in all the gaps :D That's really useful, thanks. As far as I can tell these didn't crop up yet in the instances file we processed on Friday, and I hadn't had a sufficiently detailed look at Mike's xls file to spot them. It's very useful to have these in mind before we produce any SKOS or RDFS hierarchy based on the YAGO data. I guess the important thing is that Georgi's script for generating the rdf:type triples and our script for producing the RDFS or SKOS hierarchy just follow the same rules for handling apostrophes, hyphens etc. Very useful to bear this stuff in mind from day one though, thanks! Cheers :) Tom." "internal resource uri in german json data" "uHi, the following URL returns JSON data containing internal URLs which are not accessible. uHi Ulrich, Thanks for the observation, I didn't notice this issue. There is a bug in the DBpedia VAD that rewrites the URIs in some cases although it shouldn't. You can use the output of the SPARQL endpoint directly and you won't get this bug [1]. Also the links in the page for Microdata/JSON and JSON-LD are correct; just the things that the VAD caches under /data are wrong. I'll look into how I can fix this issue. Cheers, Alexandru [1] On Mon, Oct 6, 2014 at 2:47 PM, Ulrich Leodolter < > wrote:" "DNS Problem or Server down! Server move completed" "uHi Zdravko and Richard, something must have gone wrong with the DNS switch. as well as pubby, for instance What is still there is the Wiki at Any ideas? Cheers Chris" "flickrwrappr: Resource Not Found" "uChristian, With the new flickrwrappr links, I see a lot of “Resource Not Found” messages when clicking p:hasPhotoCollection for resources a bit off the mainstream. I suggest changing the 404 message to something like “No photos related to this DBpedia resource found”. Then it appears less like something is broken, and people are more likely to say: “Well of course there are no photos for ‘Social Darwinism in European and American Thought 1860-1945’.” Cheers, Richard uRichard, thanks for the suggestion - I added some nicer screens that better explain what's happening: pean_and_American_Thought_1860-1945 Also, the source code for the flickr wrappr is now available at r. Cheers, Christian uOn 14 Sep 2007, at 18:56, Christian Becker wrote: Great! Now flickr wrappr wins the prize for the best 404 pages on the entire Semantic Web ;-D Many thanks.
Richard" "etymology extractor" "uHello everyone, I am trying to use the Wiktionary extraction framework to extract the etymology with the wikitext (i.e., including templates like {{inherited|en|enm|dore}}). However when I run wiktionary-4.1-SNAPSHOT-jar-with-dependencies.jar etymology is skipped (the etymology \"extractor template\") is not used and therefore output triples do not contain etymology. Which file should I modify to print out etymologies? How can I make sure the output contains unparsed wikitext? Thanks a lot, Ester Hello everyone, I am trying to use the Wiktionary extraction framework to extract the etymology with the wikitext (i.e., including templates like {{inherited|en|enm|dore}}). However when I run wiktionary-4.1-SNAPSHOT-jar-with-dependencies.jar etymology is skipped (the etymology 'extractor template') is not used and therefore output triples do not contain etymology. Which file should I modify to print out etymologies? How can I make sure the output contains unparsed wikitext? Thanks a lot, Ester uDear Ester, The main developer of the wiktionary module (Jonas Brekle) has left a couple of years ago and since then we do only basic maintenance on wiktionary (i.e. make sure the code compiles). The code he developed was quite flexible and configurable but at the moment we cannot support such user questions. If someone from the community (or you) would like to pick up Jona's work we'l be more than happy to give you all the help we can we are also exploring the possibility to collaborate with dbnary for wiktionary extractions. Best, Dimtiris On Fri, Feb 26, 2016 at 9:59 AM, Ester Pantaleo < > wrote: uThanks Dimitris for your answer. I just wrote to Jonas and he said that something is broken in the code, that's why when I run the wiktionary-4.1-SNAPSHOT-jar-with-dependencies.jar I only get as output a few triples (only these predicates ) My questions is how I can choose the predicates and print out for example etymologies, synonyms etc? I would like to work on this although at the moment I don't have any funding. I am trying to get funding from Wikimedia to develop an etymology extraction and visualization tool (here is a link to the draft of the proposal: ). I implemented a demo of the etymology visualization tool here The aim of the application is to visualize - in one graph - the etymology of all words deriving from the same ancestor. Users can expand/collapse the tree to visualize what they are interested in. The textual part attached to the graph can be easily translated in any language and the app would become a multilingual resource. My idea is to use Dbpedia's extraction-framework (for Wiktionary) and develop a (possibly) smart pre-processing strategy to translate Wiktionary textual etymology into a graph database of etymological relationships. Anyone interested in the project (collaborating/funding/helping in any way) please contact me. Best, Ester On Fri, Feb 26, 2016 at 10:57 AM, Dimitris Kontokostas < > wrote: uIf there is a bug, it is a pity that the whole Wiktionary extractor cannot be used. Its flexibility and configurability (described in smart) project. Hopefully someone will like to contribute and debug the code! Ester On Fri, Feb 26, 2016 at 11:41 AM, Ester Pantaleo < > wrote:" "sparql endpoint does not works?" 
"uHi All, When use function getSparqlClient() of RAP against dbpedia endpoint ( unable to parse result server returned: Error HTTP/1.1 404 File not found The requested URL was not found URI = '/sparql' However, it works fine for Have you ever face the same problem? Thanks and regards, Hendrik. Hi All, When use function getSparqlClient() of RAP against dbpedia endpoint ( Hendrik. uHi Hendrik, The LOD Cloud cache ( The DBpedia sparql endpoint is online however and as it uses the same Virtuoso v6 version as LOD configured pretty much the same, so if you could access the LOD sparql endpoint with the getSparlClient() function I don't see why you would not be able to access the DBpedia one ? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 24 Dec 2009, at 09:12, Hendrik wrote:" "Geo data for German Wikipedia missing?" "uDear all, I am searching the geo coordinates of the German Wikipedia articles in DBPedia. However, the corresponding geo_coordinates_en_uris_de* files seem to be (almost) empty, e.g., is only 711 bytes in size. Searching on the web I found the following discussion on this list which suggests that I should have a look at the mappingbased_properties_de* files as well. In doing so, I find for Hannover the two lines which supposedly should contain the coordinates of Hannover but they don't. (On Hannover are 52° 22? N, 9° 44? O). Does someone has an explanation or even better can tell me where I can get the geo locations for the articles? Thanks, Robert uHi Robert, this is caused by the missing geocoordinates template for the German Wikipedia inside of the DBpedia extraction framework. We fixed this already and will update the files online asap. Cheers, Anja On Sep 25, 2012, at 15:51, Robert Jäschke < > wrote: u uSorry, I forgot to do a \"reply all\" uHi, when searching the list archive I found this thread from two months ago: On Tue, 25 Sep 2012 07:42:54 -0700, Anja Jentzsch wrote: Has the fix been deployed? I'm still seeing near-empty coordinate files on I'm unfamiliar with Wikipedia mass data processing - is there a feasible method for me to extract that file myself from somewhere? I wouldn't necessarily have to be nice RDF. Bye Frederik" "getting DBpedia mappings to work with nested templates" "uI've been trying to get the Chembox mapping to pick up properties such as the CAS number for chemicals, but haven't had much success. One thing I noticed on the template is that it uses nested templates for different sections of the infobox, meaning that within the Chembox template there is a call to the Chembox Identifiers template which actually contains the CAS number. Based on this, I've set up a specific mapping for the Chembox Identifiers template, although this still can't find the CAS number. Is there an equivalent example of a solution to this on the DBpedia mappings wiki that I can look at? Is this a situation where a custom parser is required? Thanks, Chris uHi Chris, right now there isn't a solution for these nested templates. We have similar problems with the infobox ship, infobox aircraft and the infobox animanga. But the issue is on our list. regards, Paul" "DBPedia Live Configuration" "uHi, I'm a new maintainer for the German language DBPedia and I'm currently trying to install DBPedia Live. I've configured everything including the Abstract Extractor and the dump extraction works, but I can't get the Live Extraction to run. 
Since I've found no readme file, I've tried the same approach as with the dump extractor, namely scala:run (bad idea since the main class is java). Then I've tried running the java Main class in org.dbpedia.extraction.live.main.Main with maven:exec, however that only seems to write the statistics in publishdata/instancesstats.txt and nothing more. If there are some installation instructions cold you point me to them. If not, could you please tell me how to run the Live Extraction. Kind Regards, Alexandru Hi, I'm a new maintainer for the German language DBPedia and I'm currently trying to install DBPedia Live. I've configured everything including the Abstract Extractor and the dump extraction works, but I can't get the Live Extraction to run. Since I've found no readme file, I've tried the same approach as with the dump extractor, namely scala:run (bad idea since the main class is java). Then I've tried running the java Main class in org.dbpedia.extraction.live.main.Main with maven:exec, however that only seems to write the statistics in publishdata/instancesstats.txt and nothing more. If there are some installation instructions cold you  point me to them. If not, could you please tell me how to run the Live Extraction. Kind Regards, Alexandru uHi Alexandru, it's so nice to have a German DBpedia-Live working ;). But, in order to get DBpedia-Live working you should execute that command \"mvn scala:run\". Moreover and this is the most important point, in order to get DBpedia-Live running you should have access to the Live-Update stream of Wikipedia. In order to get access to such a stream you should have a username and password with suitable privileges. This can only be done by contacting Wikipedia maintainers in order to get your login credentials. So you should contact them before getting everything ready for work. If you have any further, don't hesitate to contact me. uCool! Is there some language-specific magic that needs to be done? Are Max's improvements on the configurability going to show up in the live branch as well? I'd like to try a live DBpedia Portuguese if my colleagues from Brazil would be able to help. Also, any help on how to obtain the user-power of hooking to the Wikipedia update stream would be great. A quick Web search with the obvious combinations of {wikimedia, wikipedia, update, stream, user, privilege} didn't help. Cheers, Pablo On Wed, Jun 29, 2011 at 5:31 PM, Mohamed Morsey < > wrote: uHi, On 06/29/2011 05:57 PM, Pablo Mendes wrote: I don't think that it will need much effort to get DBpedia-Live working for other languages. For Configurability, I should check that modification first, and see how it affects the whole extraction process and when I make sure that everything is working properly, definitely we will adopt those changes. Nice to have a Portuguese version too ;). Sorry to mention, that you should get access to Live-Update stream in order to be able to get a continuous stream of changes from Wikipedia :(. uHi Mohamed, But, in order to get DBpedia-Live working you should execute that command \"mvn scala:run\". Where exactly should I execute the scala:run command. Running it in the extraction_framework directory just tries to run the \"server\" and running it in the \"live\" directory just gives me a \"[WARNING] Not mainClass or valid launcher found/define\" . Moreover and this is the most important point, in order to get DBpedia-Live running you should have access to the Live-Update stream of Wikipedia. 
In order to get access to such a stream you should have a username and password with suitable privileges. This can only be done by contacting Wikipedia maintainers in order to get your login credentials. So you should contact them before getting everything ready for work. Can you give me a link to a contact page or an e-mail address where I can contact the maintainers ? Furthermore, in which config file to I set the user and password for the Live Update Stream ? I've looked in dbpedia_default.ini , but I only see config options for Virtuoso and Mysql (and postgress) but nothing for an update stream. Looking trough the code I've found String oaiUri = \" authentication for those. Kind Regards, Alexandru On Wed, Jun 29, 2011 at 5:31 PM, Mohamed Morsey < > wrote: uHi Mohammed, order to be able to get a continuous stream of changes from Wikipedia :(. Sorry for poorly formulating my question. What I meant was: where/how/from whom do I get \"access to the Live-Update stream\"? I searched a bit and got nowhere. Can you give us a link or an e-mail address perhaps? Thanks, Pablo On Jun 29, 2011 6:28 PM, \"Mohamed Morsey\" < > wrote: for other languages. it affects the whole extraction process and when I make sure that everything is working properly, definitely we will adopt those changes. order to be able to get a continuous stream of changes from Wikipedia :(. of uOn 6/30/2011 9:56 AM, Pablo Mendes wrote: uHi Alexandru, First of all sorry for my belated reply. On 06/29/2011 07:38 PM, Alexandru Todor wrote: You should execute it from \"live\" folder. At first, you told me that it works but it only wrote some statistics, which means that it was working. Anyway try to the following fragment to pom.xml file of live module, and tell me if it works or not. org.scala-tools maven-scala-plugin uHi Pablo, On 06/30/2011 09:56 AM, Pablo Mendes wrote: I'll contact you later if we can find a way to get a live update stream for Portuguese language :)" "Bad Wikipedia abstracts" "uDoes anyone know if DBPedia have plans to come with improvements to their logic for extracting abstracts? The current algorithm fails on so many easy cases that it affects it's usefulness. Take this example: \"} |- | |- | |}\"@en . The actual article is I could live with if a very few articles get misparsed and contain junk like that, but it simple fails on very easy examples like \"Volvo\": \":This article is about Volvo Group - AB Volvo; Volvo Cars is the luxury car maker owned by Ford Motor Company, using the Volvo Trademark.\" Looking at the actual page ( makes it easy to see that what it should have extracted is: \"Volvo Cars, or Volvo Personvagnar, is a Swedish automobile maker founded in 1927 in the city of Gothenburg in Sweden.\" There are over 10000 articles that have got extracted like that: Another 2000 articles contain \"redirect messages such as\": \":\\"F1 1995\\" redirects here. For the video games based on the 1995 Formula One season, see F1 95.|}\"@en . Looking at the article it's easy to see that a better sentence to fetch is \"The 1995 Formula One season was the 46th FIA Formula One World Championship season\". I don't think it should be too complicated to avoid getting junk by just looking at strings such as \"redirects here\" or \"This article is about\". Does someone know if someone is working on improving on this for the next dump they gonna create. Or has someone written another better parser already that I can use (or just download the dump of what it has generated). 
uHi Omid, you are right, the abstracts' quality needs improvement. We have plans to have better abstracts :) But nobody is working on it at the moment or will do in the very near future. As you might know, the DBpedia extraction framework is open source, and we highly welcome contributions! You can find the Abstract extractor at ShortAbstractExtractor.php?revision=468&view;=markup If you have any question regarding the framework, please feel free to send me a message. Cheers, Georgi uI'm willing to help contributing with code to improve this feature. However I have some issues getting started running the code base. Is there any tutorial that shows step by step how one can get the project running on my own machine? I have seen with \"svn co dbpedia\". I wonder why do I need to specify a database in \"./databaseconfig.php.dist\"? What will this database be used for? Correct me if I'm wrong, but I thought what the program does is to read as input the wikipedia dump (for example \"enwiki-20080103-pages-articles.xml\") and output a lot of .nt files (such as \"articles_abstract_en.nt\" etc). (1) Why does this process involve a MySQL database? (2) As my first project I want to improve on the abstract extractor (dbpedia/extraction/extractors/ShortAbstractExtractor.php). I do not want to generate anything but \"articles_abstract_en.nt\", so I want to disable everything except this particular module. How do I do this? I don't want all other components to run and take time. (3) Is there any documentation for someone brand new to this project that shows with simple steps how one get started with compiling and running this. start.php, but is there any more documentation? For example: Where is the config files? Where do I specify the path to \"enwiki-20080103-pages-articles.xml\"? Where will the output be stored? I guess the answer to these questions are somewhere in the source code, but it would be nice to have a \"get started with DBPedia in 5 minutes\" kind of tutorial showing step by step what one need to do to get it up and running. Thanks /Omid On 5/2/08, Georgi Kobilarov < > wrote: uHi Omid, The DBpedia Scripts wont read the xml Files. All Data from the Wikipedia Dumps should be loaded in a MySQL Database first. You can do this using the import.php Script in /importwiki. For en Wikipedia you may start this with eg.: php import.php -c -d DOWNLOADPATH -ip 127.0.0.1 en DBHOST DBNAME DBUSER DBPASSWORD -ip is your machine ip (helps if mwdumper throws exceptions as i remember) -you can find these parameters calling php import.php in /importwiki Since this is done youll have your Wikipedia Database on your Machine and you can start your first extraction. (The import script downloads, unzips and writes the Dump to Database) - will need some disk space ;) First you have to rename the databaseconfig.php.dist to databaseconfig.php and put your Database Parameters in this file. For starting an extraction I used the start.php just comment out the extractors you wont need. Extracting all Datasets should be done by extract.php (I just copied out the Code for Shortabstracts on End of this Mail) So i hope this helps. It has been a while since i used the Framework so it could be, that i forgot anything. Just let me know if its not running. 
Jörg start.php: function autoload($class_name) { if(preg_match('~^.*Extractor.*$~',$class_name)) require_once ('extractors/'.$class_name.'.php'); else if(preg_match('~^.*Destination.*$~',$class_name)) require_once ('destinations/'.$class_name.'.php'); else require_once $class_name . '.php'; } $pageTitles = array(\"Google\"); //will extract the Google Article - for all articles see original start.php //Create a Extraction Job $job = new ExtractionJob( new DatabaseWikipedia(\"en\"), $pageTitles); // Create ExtractionGroups for each Extractors $groupShortAbstracts = new ExtractionGroup(new SimpleDumpDestination()); //SimpleDumpDestination will Output to Screen $groupShortAbstracts->addExtractor(new ShortAbstractExtractor()); $job->addExtractionGroup($groupShortAbstracts); //Execute the Extraction Job $manager = new ExtractionManager(); $manager->execute($job); uHi Omid, a little addition to Jörg's previous message: you don't have to import the Wikipedia dumps into a mysql server for development. Use the LiveWikipedia Connector instead of the DatabaseWikipedia connector (line 114, start.php). Then you can specify a small bunch of Wikipedia articles to test and develop with (line 104, start.php), which will be downloaded using a Wikipedia webservice at runtime. But don't overstress the webservice, you're IP might get blocked. one request per second is ok. You'll need the mysql database as soon as you want to create a whole ntriples file with all dbpedia resources. If you don't want to do that, we can use your improved extraction code with our next dataset release. If you have any question, please don't hesitate to ask. Best, Georgi" "Removing LingPipeSpotter spotter" "uHello, I would like to remove the lingPipeSpotter since Im working on a machine which cannot load the whole model into ram. The problem is I get the following error: it seems that the server still looks for some settings of the lingpipespotter even tho I did not select it. this is my server.properties spotters conf: org.dbpedia.spotlight.spot.spotters = CoOccurrenceBasedSelector any hint? Thank you in advance. Hello, I would like to remove the lingPipeSpotter since Im working on a machine which cannot load the whole model into ram. The problem is I get the following error: advance." "Regarding using Article Text (besides templates and infoboxes) for extracting triples for DBpedia" "uHi everyone, I am a PhD student and plan on using the DBpedia dataset for my research, I have read the relevant papers on the DBpedia website, however, I am still confused about whether DBpedia uses article content for extracting triples or not, I understand that DBpedia uses info-boxes and article templates from Wikipedia, but I need to know whether it extracts any infromation from the textual content of the articles or if it extracts any relations from the text of articles besides the templates. I would really appreciate if someone could clarify this to me. Thank you. Best Wishes, Zareen. uHello, zareen syed wrote: DBpedia does not extract information from the textual context of articles in the sense that it tries to understand the semantics of the text. It extracts e.g. links between articles and uses the first part of the text as abstracts for the article etc., but does not use NLP parsers (yet). Kind regards, Jens" "How many Wikipedia articles covered by DBpedia?" "uHi, are there any information out there about how many articles in Wikipedia have infobox templates and other kinds of structured information that got extracted by DBpedia? 
I know there is an DBpedia URI for every article in the English Wikipedia, but some contain more information (like Kind regards, Benedikt -" "FILTER NOT EXISTS error on sparql" "useems stems from my use of FILTER NOT EXISTS. But this query worked a when I last ran it about 38 hours ago. Below is the error message I get, which also shows the query. Has the server software changed? uOn 3/9/13 2:28 PM, Tim Finin wrote: uMeanwhile your query works using this clunky statement: optional {?S dbpo:numberOfUndergraduateStudents ?ANY} filter (!bound(?ANY)) Benjamin Am 09.03.2013, 21:25 Uhr, schrieb Kingsley Idehen < >: uOn 3/11/13 9:20 AM, Benjamin Großmann wrote: Yes, but Tim's effort is bound to standard SPARQL and I want to keep it that way :-) We are fixing this regression. Note, that you don't have this problem on: Kingsley uHmmm that should also be standard SPARQL 1 and 1.1 Once I learned this pattern from that reference's bound example (although it's not so straight forward like the new \"filter not exists\"). Am 11.03.2013, 14:39 Uhr, schrieb Kingsley Idehen < >: uOn 3/11/13 10:18 AM, Benjamin Großmann wrote: Correct. The key point is that we have a bug to fix. Tim shouldn't have to refactor his existing app :-) Kingsley uOn 3/9/13 2:28 PM, Tim Finin wrote:" "receiving lots of sparql server 404 errors" "uI am query the sparql endpoint for the last couple of Weeks and getting quite a few 404 failed to connect to remote server errors Why is this? uHi John, The DBpedia server went down for maintenance this afternoon, with the server having been updated. Do let us know if you still have issues accessing the server over the coming days Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: Twitter: On 7 Oct 2009, at 18:52, John Abjanic wrote: uHugh Williams wrote: Remember, it does also depend on what he is doing in conjunction with the rest of the world. Rate limits are in effect, even outside the Virtuoso instance based on increased traffic across many fronts (most trying to slurp the entire database in predictable fashion). Kingsley uWe¹re having issues now: 42001 Error SR185: Undefined procedure SIMILE.DBA.cl_rdf_inf_init. in SPARQL query: define sql:signal-void-variables 1 define input:default-graph-uri select distinct ?Concept where {[] a ?Concept} Thanks, John On 7/10/09 20:23, \"Hugh Williams\" < > wrote:" "What could you do with free pictures of everything on Earth?" "uOur latest project, aims to drastically improve image information retrieval by indexing images by linked data terms. Although I've got reservations about owl:sameAs, ookaboo can be simply linked to to dbpedia by loading the following NT file, we're adding content to this site very rapidly, so this mapping file is automatically updated every 24 hours. What can I do to improve the discoverability of this file so that people can find this file and use it? I ~very~ badly want people to start using this service, both the RDFa metadata and the JSON API, so I'll work closely with the first few consumers of the API to make things work." "Question about query" "uI see we can search for items like ?film, where can I find the list of items that can be searched and their properties? I'm trying to look for people and built a query to run a free text search on label which work but take too long, I want to limit to people but have no idea how :( Any suggestion? See how Windows Mobile brings your life together—at home, work, or on the go. 
{ margin:0px; padding:0px } body.hmmessage { FONT-SIZE: 10pt; FONT-FAMILY:Tahoma } I see we can search for items like ?film, where can I find the list of items that can be searched and their properties? I'm trying to look for people and built a query to run a free text search on label which work but take too long, I want to limit to people but have no idea how :( Any suggestion? See how Windows Mobile brings your life together—at home, work, or on the go. See Now" "Compiling extraction_framework problems" "uI'm trying to build the extraction_framework project but running into a few issues. When i tried to build using maven 3.0.3, i got a warning and a error: [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /Users/tcc/src/scala/dbpedia_extraction_framework/pom.xml, line 73, column 21 [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-surefire-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /Users/tcc/src/scala/dbpedia_extraction_framework/pom.xml, line 81, column 21 [ERROR] Failed to execute goal on project core: Could not resolve dependencies for project org.dbpedia.extraction:core:jar:2.0-SNAPSHOT: The following artifacts could not be resolved: org.json:json:jar:1.1, org.openrdf:openrdf:jar:1.0: Failure to find org.json:json:jar:1.1 in Investigating the error, I noticed that the json and rdf are referring to packages which are not in public repos? org.json json 1.1 jar org.openrdf openrdf 1.0 The closest json lib i found was one version 20090211 ( I couldn't find a openrdf artifact, i did only notice these ones, none of which are named just openrdf. Which repo should I add to successfully build the extraction_framework? Thanks for helping me get started with developing with dbpedia! uI believe these libraries are in the AKSW repo. Also, I think the project uses Maven2. So it's probably a good idea to confirm these two suspicions via the documentation for developers in wiki.dbpedia.org Cheers Pablo On Jul 10, 2011 3:13 AM, \"Tommy Chheng\" < > wrote: few issues. org.apache.maven.plugins:maven-compiler-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /Users/tcc/src/scala/dbpedia_extraction_framework/pom.xml, line 73, column 21 org.apache.maven.plugins:maven-surefire-plugin is missing. @ org.dbpedia.extraction:main:2.0-SNAPSHOT, /Users/tcc/src/scala/dbpedia_extraction_framework/pom.xml, line 81, column 21 dependencies for project org.dbpedia.extraction:core:jar:2.0-SNAPSHOT: The following artifacts could not be resolved: org.json:json:jar:1.1, org.openrdf:openrdf:jar:1.0: Failure to find org.json:json:jar:1.1 in resolution will not be reattempted until the update interval of maven2-repository.dev.java.net has elapsed or updates are forced -> [Help 1] packages which are not in public repos? > > which are named just openrdf. uHi Tommy, as Pablo mentioned those libraries are in AKSW repository. I've downloaded the project and built it and it compiled with no problem. I guess that there was some connection problem to the repository, so MAVEN couldn't download the required files successfully. So, please try again to rebuild it. On 07/10/2011 09:27 AM, Pablo Mendes wrote: uHi, I can confirm this problem. I've tried it multiple times on different servers and have had the exact same problem. It started since the internet problems in Leipzig appeared. 
I had to work around it by copying the missing artifacts from a maven repository on a server where I compiled the framework before the Leipzig problems. Mohamed, did you try to compile the framework on a computer where it hasn't been compiled before, or try to delete the .m2 repository directory before the mvn install command ? If you do that I think you'll notice the same problems we did. Kind Regards, Alexandru Todor On 7/10/2011 3:12 AM, Tommy Chheng wrote: uThanks, it was an internet connection problem. I re-compiled and it worked now. Are these packages custom built in some fashion? Curious to know why they are different from the mvnrepository mirror in version number style and package names. uHi, On 07/10/2011 06:07 PM, Alexandru Todor wrote: Yes, that was because our repository was down. I tried it and it works, so please try to get the latest version of the project as we have changed our main repository after that Internet problem, and we have also changed the pom file accordingly. Let me know if compiles or not." "Question" "uHello, To dump and install extraction frame work, we follow all steps (Please check the link, we hope you are familiar with this.) We successfully pass each steps and reach in /clean-install-run extraction extraction.abstracts.properties. This one also run on our Ubuntu terminal successfully. The problem is how we can see or where we can find the RDF files? Please advise your opinions on this issue. We couldn’t know how to get the RDFs. We thank you in advance for your help. Regards Biniyam. G Hello, To dump and install extraction frame work, we follow all steps (Please check the link, we hope you are familiar with this.) G uHi Biniyam, It depends on what you set your *base-dir* to in extraction.abstracts.properties. If you're running extraction on enwiki for example, having timestamp 20140811 (assuming you've set base-dir to /data), you'll find your RDFs for enwiki at /data/enwiki/20140811. I hope you've setup MySQL and your local MediaWiki instance correctly too? Cheers, Nilesh You can also email me at or visit my website On Sun, Aug 24, 2014 at 3:35 PM, Biniyam Getahun < > wrote:" "How can I help?" "uDBPedia, The following query for the number of universities in each country is a bit “ugly”. For example, 1) Countries are identified with URIs or Strings, and the same country is identified in many different ways. 2) The country “20” has three universities :-) I’m not very familiar with the workflow that DBPedia has, but I’m curious what I could do to fix the results as “upstream” as possible. Is there a “how to help” page somewhere that I could read? Or, could someone provide me a few pointers to get started? Thanks for your consideration. Regards, Tim Lebo {{{ prefix dbpedia: prefix dbo: prefix dbp: select ?country count(distinct ?university) as ?count where { ?university dbo:type dbpedia:Public_university optional{?university dbp:country ?country} } group by ?country order by desc(?count) }}} Query results: 1brlE6z uOn Sat, Dec 7, 2013 at 6:52 PM, Timothy Lebo < > wrote: In your query, you're using the raw infobox data. That data is much more noisy than the data in the DBpedia Ontology. If you restrict yourself to the DBpedia ontology, you'll get much more sensible results. E.g., if you execute this query on the DBpedia SPARQL endpoint ( dbpedia-owl: ): select ?country (count(?university) as ?count) where { ?university a dbpedia-owl:University optional{ ?university dbpedia-owl:country ?country} } group by ?country you get much better results. 
For more about the differences, see this StackOverflow question ( and some of the DBpedia documentation that the answer links to. Happy SPARQLing! //JT uOn Sat, Dec 7, 2013 at 11:52 PM, Joshua TAYLOR < > wrote: This is a common enough issue that I've described it a number of times on StackOverflow. Here are two other cases where the noise in the raw infobox data affected people's results: 1281433 uOn 12/7/13 6:52 PM, Timothy Lebo wrote: Tim, dbp: is a legacy namespace for properties. dbo: is the current namespace for properties. Thus, you would be looking at something like: . uKingsley and Josh, Thanks so much for your responses pointing me to DBO instead of DBP. That certainly helps my current usage scenario. Also thanks to your pointers, I found a description of dbp vs dbo: I was also looking to contribute to the info box parsers, which I think the best place to start is: Is that true, or are there other good starting points? Thanks, Tim On Dec 8, 2013, at 3:41 PM, Kingsley Idehen < > wrote:" "musicbrainz.org sameAs statements" "uHello, A short while ago there was a discussion on the LOD list about what are the best URIs to use for music artists / tracks / releases [1]. There seemed to be a consensus that the 'original' musicbrainz.org URIs should be used even tho the dereferencing is not really there right now (a solution is said to be coming soon). My question is when might we see this in DBpedia? Kidehen suggested musicbrainz.org URIs would replace (or augment) the zitgist ones. Is there anything I can do to help? -Kurt J uKurt J wrote: Kurt, We could make another MBZ dump. This is basically what Zigist URIs are, but they were generated around the onset of LOD. More recently, we made another dump which was then incorporated into both the DBpedia and BBC data spaces. Also note, that we have a Sponger Cartridge for MBZ which allows you to progressively generate Linked Data from this data space also. Ideal situation would be for MBZ project folks to work with us re. Linked Data publishing directly from the MBZ domain. If that doesn't work, then you could work with us re. timely production of new dumps, or an instance that simply does the same thing on a progressive basis via the Sponger Middleware. Kingsley" "question regarding retrieval of news organisations" "uHello, I've been lurking on this list for a while, not being sure whether it's a forum intended for 'admin' or 'developer' type discussions. Over time, it seemed like it's a bit of both which is why I'm going to ask my newbie/developer question (I manage my way around SPARQL, and know a bit about dbpedia.org, mainly through this list). I'd like to look up information in dbpedia.org that's related to news sources (websites, newspapers, media organisations, blogs, etc.). My main goal is to retrieve some information about the news organisation, a bit like the Google info box (name, summary abstract, place of where it is geographically located, country of origin, date of creation, ). I understand that there certainly isn't always all the information I'm looking for. All I have at my disposal are the domain names of URLs retrieved via RSS feeds (i.e. I have no control over them whatsoever): and many hundreds, probably thousands of others. My application will take each one and send a SPARQL query to dbpedia using this 'identifier' as its initial and only data point. I cannot have specific queries for specific URLs. 
In theory, they seem like useful identifiers, and often enough, when I check with the English wikipedia, there is an equivalent website link in the info box, which if I'm not mistaken is the part of a wikipedia page that is getting mapped to dbpedia. However, I'm not sure to which properties these website links are mapped. Looking at dbpedia-owl:wikiPageExternalLink, dbpprop:website, foaf:homepage which all have the website's link. Is there a canonical one to use? Is there any guarantee that if there is website link in an infobox on wikipedia (let's assume I restrict myself to the English one)? Also, looking at some of the dbpedia pages (e.g. dbprop:headquarters (contains a string) and dbpedia-owl:headquarter (contains a resource which is much more useful) from where I can then retrieve information about the city and the country. Is there a recommended way of querying dbpedia using a certain set of properties rather than others? Or should one just create a lot of OPTIONAL clauses and hope to catch at least one of them? I have the impression that one cannot necessarily trust that certain information will be in dbpedia.org (not a problem, it obviously being a volunteer effort). I'm hoping, in this proof of concept, to use some sparql queries and dbpedia.org as it appears to have the richest information, but if there are other linked data endpoints that contain more/better information on news organisations across the world, I'd be very interested, if that's not beyond the remit of this discussion group. Thank you very much in advance for any help or pointers you may be able to provide. cheers, Jakob. PS: I have just discovered this page http://mappings.dbpedia.org/index.php/Mapping_Guide which might hold some answers to my questions, I guess. PPS: And here's a first throw at a query that retrieves a couple of items I'd be interested in: https://gist.github.com/jfix/cfe167ab1d5f6a84385d uHi Jacob, thanks for that interesting question! I'm copying FP7 Multisensor, since that may be of interest. Jacob's question (and ensuing discussion) can be found on: - - Not the question you asked, but: Check \"rdf:type yago:SOMETHING\" for some of the news orgs, and chances are you'll find a type that identifies most of them. - wikiPageExternalLink is ANY link mentioned in the article. If it's a host name without any path, then chances are that's the home page of the subject, but no guarantee. - foaf:homepage must be the home page of the subject, though someone may have made a mistake in wikipedia or in a mapping - also check foaf:page. It's \"any\" page related to the subject. It's a super-prop of foaf:homepage, but I'm not sure that reasoning is effected - dbpprop:website is a raw property that may be used in some wikipedia templates but not others. I think the best is to check all these methods over two sets: - news orgs identified by yago:SOMETHING - news orgs in your list of home pages dbprop is a raw property, dbpedia-owl is a mapped property. - You can find all uses of that mapped property (i.e. all maps that use it) here: - You can find other ways to explore the ontology and mapping here: - Unfortunately there's no way to restrict to English mappings only. All variations of \"mapping en\" \"ontologyproperty headquarter\" failed. 
Anyway, the map you want is here (but there's also \"television channel\" etc): - I checked a few wikipedia instances and all use \"headquarters\" consistently - I guess same for website->foaf:homepage Back to your question: \"dbprop:headquarters (contains a string), dbpedia-owl:headquarter (contains a resource which is much more useful)\". - yes, dbo:headquarter gets the links (because it's an ObjectProperty), dbp:headquarters gets any text - E.g. see the split in dbp:headquarters 132 (xsd:integer); 93213 (xsd:integer); Fax : 1 49 22 22 35; Tel : 1 49 22 20 01 dbo:headquarter dbpedia:Saint-Denis - the ObjectProperty extractor takes ANY link, no matter what it signifies. - are there dbo:headquarter that are not Places? Yep, there are some thousands: select * {?x dbpedia-owl:headquarter ?y filter not exists {?y a dbpedia-owl:Place}} - My favorite example is http://dbpedia.org/resource/Asheville_Citizen-Times dbo:headquarter http://dbpedia.org/resource/O._Henry I think that's a fiction newspaper devised by O.Henry ;-) - BTW, for some nefarious reason http://dbpedia.org/resource/Madrid is not a Place :-) so let's refine the query: select * { ?x dbpedia-owl:headquarter ?y filter not exists {?y a dbpedia-owl:Place} filter (?y != dbpedia:Madrid) } You may also want to check what fields are not yet mapped: http://mappings.dbpedia.org/server/templatestatistics/en/?template=Infobox_newspaper - should be interesting: 20 publishing_city, publishing_country - these may also be interesting: ISSN, oclc The description of all template fields is at: https://en.wikipedia.org/wiki/Template:Infobox_newspaper Exactly. Adding mappings is easy, and they appear on live.dbpedia.org shortly afterwards. Try wikidata. - Pick a smaller newspaper (e.g. Chicago Defender) - It's guaranteed to be found since wikidata reflects all of wikipedia: https://www.wikidata.org/wiki/Q961669 - But does it have an appropriate type? Yep, \"instance of=newspaper\" - Do they have the home page? Nope. That's the strength of DBpedia: extracts a lot more props, even though not always perfectly. Counting: - How many newspapers in wikidata? http://vladimiralexiev.github.io/CH-names/README.html#sec-2-1-4 (as of Dec 2014) -> 6187 Also see summary of some other counts, and a gist with all instance counts - How many in wikidata today? Click a query here: http://wdq.wmflabs.org/wdq/ : CLAIM[31:11032] And then run it here: http://tools.wmflabs.org/autolist/autolist1.html?q=CLAIM%5B31%3A11032%5D -> 6275 - How many with webpage? CLAIM[31:11032] AND CLAIM[856] http://tools.wmflabs.org/autolist/autolist1.html?q=CLAIM%5B31%3A11032%5D%20AND%20CLAIM%5B856%5D -> only 646 - How many in dbpedia.org? select count(*) {?x a dbpedia-owl:Newspaper} -> 6043 (as ofhmmAug 2014: pretty close to wikidata) - How many with webpage? select count(*) {?x a dbpedia-owl:Newspaper. filter exists {?x foaf:homepage ?y}} -> 4666. Pretty Good!" "German government proclaims Faceted Wikipedia/DBpedia Search one of the 365 best ideas in Germany" "uHi all, very good news: The German federal government has proclaimed Faceted Wikipedia/DBpedia Search as one of the 365 best ideas in Germany in the context of the “Deutschland – Land der Ideen” competition. The competition showcases innovative ideas in areas such as science and technology, business, education, art and ecology. The patron of the competition is the German President Horst Köhler. 
For more information about the competition and for test-driving Faceted Wikipedia/DBpedia Search please refer to edia-search-one-of-the-365-best-ideas-in-germany/ Have a nice weekend, Chris" "Get a specific row given its rank from an ordered result in SPARQL" "uHi, Is it possible to do ranking in SPARQL, to get  a row given its rank from an ordered result. For example for the first row  we can use ORDER BY  DESC(?property)  and using the LIMIT clause as LIMIT 1. How can we get only the second or  third row for example?   Best regards Samir De : Patrick Cassidy < > À : Envoyé le : Mardi 27 Décembre 2011 1h26 Objet : [Dbpedia-discussion] DBpedia ontology I have looked briefly at the DBpedia ontology and it appears to leave a great deal to be desired in terms of what an ontology is best suited for: to carefully and precisely define the meanings of terms so that they can be automatically reasoned with by a computer, to accomplish useful tasks.  I will be willing to spend some time to reorganize the ontology to make it more logically coherent, if (1) there are any others who are interested in making the ontology more sound and (2) if there is a process by which that can be done without a very long drawn-out debate.    I think that the general notion of formalizing the content of the WikiPedia a a great idea, but to be useful it has to be done carefully.  It is very easy, even for those with experience, to put logically inconsistent assertions into an ontology, and even easier to put in elements that are so underspecified that they are ambiguous to the point of being essentially useless for automated reasoning.  The OWL reasoner can catch some things, but it is very limited, and unless a first-order reasoner is used one needs to be exceedingly careful about how one defines the relations.   I am totally new to this list, and would appreciate pointers to previous posts discussing such issues related to the ontology.  A quick scan of recent posts did not turn up anything relevant to this matter.   Perhaps those who have been particularly active in building the ontology would be willing to discuss the matter by telephone?  This could help educate me and bring me up to date quickly on what has been done here.   Meanwhile, Merry Christmas and Happy Hanukkah and Happy New Year to the members of this group.   Pat   Patrick Cassidy MICRA Inc. 908-561-3416 uHi! If I understand correctly you need to use OFFSET statement. uHi Yury Thank you. Using the OFFSET statement, it means I can do as following after the result is sorted. For example to get the third row:   LIMIT 1 OFFSET 3   Best regards Samir De : Yury Katkov < > À : Samir Bilal < > Cc : \" \" < > Envoyé le : Mercredi 28 Décembre 2011 21h34 Objet : Re: [Dbpedia-discussion] Get a specific row given its rank from an ordered result in SPARQL Hi! If I understand correctly you need to use OFFSET statement." 
"How to use the "CategoriesClassesToArticlesExtractor" extractor" "uphp / * This file starts the DBpedia extraction process for Abstract * * It will not save into database but instead put into n-triple file inside \"D:\extracted\en\" * You need to import this file manually into virtuoso db * */ error_reporting(E_ALL); // automatically loads required classes require('dbpedia.php'); // set $extractionDir and $extractionLanguages require('extractionconfig.php'); $manager = new ExtractionManager(); // loop over all languages foreach($extractionLanguages as $currLanguage) { Options::setLanguage($currLanguage); $pageTitles = new ArticlesSqlIterator($currLanguage); $job = new ExtractionJob(new DatabaseWikipediaCollection($currLanguage), $pageTitles); $extractionDirLang = $extractionDir.'/'.$currLanguage.'/'; if(!is_dir($extractionDirLang)) mkdir($extractionDirLang); $group = new ExtractionGroup(new csvNTripleDestination($extractionDirLang.'CategoriesClasses_en')); //$extractorInstance = new CategoriesClassesExtractor(); $extractorInstance = new CategoriesClassesToArticlesExtractor(); $group- addExtractor($extractorInstance); $job->addExtractionGroup($group); $date = date(DATE_RFC822); Logger::info(\"Starting CategoriesClasses extraction job for language $currLanguage at $date\n\"); $manager->execute($job); $date = date(DATE_RFC822); Logger::info(\"Finished CategoriesClasses extraction job for language $currLanguage at $date\n\"); } uHello Sarif, I'm at a losslooks like your script should work. Are there any other errors or warnings before the one you included in your mail? Did you make any changes in CategoriesClassesToArticlesExtractor.php? Or are you using the same version as ? Did you change any other files? Did you move any files after you checked out DBpedia from Subversion? Cheers, Christopher On Sun, Dec 20, 2009 at 00:23, sarif ishak < > wrote: uHi again, Looking at \"InstanceTypeExtractor.php\", at line 27 below:" "WordNet links" "uI have only been looking at the WordNet link file for a couple of days, but I have already found a number of problems. I have posted the errors as bugs on the sourceforge bug tracker. But this does not seem to be particularly active!! Does anyone know the processes responsible for the WordNet links, how to suggest changes, etc. etc.?? uHello, Csaba Veres schrieb: We have fixed some bugs in the past days and are preparing a new release. We (DBpedia) still don't get funding, so it is sometimes hard to find sufficient time. The tracker is the right place to post bugs and feature requests. It's indeed a problem to know who is responsible for which data set. @others: Maybe we should add more information at the bottom of the download page [1]? I assigned your reports to Georgi as he may be able help. We can probably solve the problems you posted manually, but I do not know how accurate the WordNet links are in general. Apart from this, you can (with moderate effort) contribute to DBpedia and improve the WordNet extractor if you like. Kind regards, Jens [1] Downloads30" "Arabic chapter" "uhi all the Arabic dbpedia was published in January of this year, i don't found it in this URL ar.dbpedia.org, there is other URL? what news of this project? cordially Ghani hi all the Arabic dbpedia was published in January of this year, i don't found it in this URL ar . dbpedia .org , there is other URL? what news of this project? 
cordially Ghani uDear ghani Kindly the Arabic chapter of DBPedia is working fine and there is no problem in itThere is deterministic maintenance team for the Arabic DBPedia under control of the Arabic Semantic Web research  group in Fayoum  university. So if you have any problem or questions Plz feel free to ask me directly or Dr.Haytham Al-Feel  as  responsible for this Arabic Chapter. From: Sent: Wednesday, November 9, 2016 11:47 AM To: ; Dr. Haytham Al-Feel Subject: [DBpedia-discussion] Arabic chapter hi all the Arabic dbpedia was published in January of this year, i don't found it in this URL   ar.dbpedia.org, there is other URL? what news of this project? cordially Ghani" "content negotiation?" "uHi, can someone please comment on how content negotiation (that is, Accept in the HTTP header of the request to denote data rendering preference) is currently used in DBpedia? For instance, if I open with the browser I see a page, but I didn't notice if content negotiation is happening behind the scenes. Thank you/danke Gustavo uHi, can someone please comment on how content negotiation (that is, Accept in the HTTP header of the request to denote data rendering preference) is currently used in DBpedia? For instance, if I open with the browser I see a page, but I didn't notice if content negotiation is happening behind the scenes. Thank you/danke Gustavo uHi Gustavo, we have three URIs for each concept. 1. A the abstract concept, 2. A HTML page describing the concept and 3. A RDF file describing the concept. We do content negotiation on the first abstract concept URI, not on the other ones. So if you dereference the concept URI asking for HTML, you end up at the second URI, aksing for RDF you end up at the third URI. Cheers Chris" "XML formats returned by SNORQL interface?" "uWhen I select \"as XML\" in the Results field of attribute set to \"sparql\", when I thought I'd get the XML described at Also, when I select a Result value of \"as XML+XSLT\" I get the same HTML table described above whether I enter a URL for a stylesheet in the \"XSLT stylesheet URL\" field or not. Could someone give me a little more background on these choices? thanks, Bob uBob, It's a bug. An incompatibility between Snorql and Virtuoso. Virtuoso applies content negotiation to the query result, and your browser indicates that it prefers HTML over SPARQL XML Results, so this is what Virtuoso will send Workaround: Append \"&output;=xml\" to the URI to get the XML. Or use a non-browser client (such as curl or wget) to access the URI. Best, Richard On 25 Oct 2008, at 18:28, Bob DuCharme wrote:" "Foreign language Yago classes" "uAre there any plans in the near future to support Yago categories for foreign language Wikipedia pages? Currently, Yago classes only exist for English Wikipedia pages. I'm aware that Yago classes are produced using wordnet which is in English, so it might be difficult to categorize the foreign language Wikipedia pages with Yago classes. But I would also like to hear of any work or progress regarding how one can categorize the foreign language Wikipedia pages if someone has done any work on this. Thanks /Omid uHi Omid, not sure if I understand correctly. The English article about Berlin and the German article about Berlin describe the same real-world city Berlin in Germany, which is dbpedia.org/resource/Berlin Yago classes are assigned to the concept, not to an article. Berlin is a yago:city, independent of whether you're looking at the German or the English Wikipedia article about it. 
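The point that Yago classes hang off the concept rather than off a particular language edition can be checked directly on the public endpoint; a sketch that lists the Yago types attached to the Berlin resource (the namespace test assumes the usual http://dbpedia.org/class/yago/ prefix):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?type WHERE {
  <http://dbpedia.org/resource/Berlin> rdf:type ?type .
  FILTER regex(str(?type), "^http://dbpedia.org/class/yago/")
}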
Cheers, Georgi uBerlin is a very trivial example since it's the same name for both pages in English and German. Most pages have different names such as: These two pages refer to the same real world thing but does not have the same Wikipedia page name. Also, how would you categorize a page that does not exist in English Wikipedia, such as: What category does that page belong to? That DBPedia chooses to have URIs such as \" Wikipedia locale in the URI) limits it to not being able to represent the real world thing that refers to \" the English Wikipedia. Thanks /Omid On Wed, Jun 3, 2009 at 4:02 AM, Georgi Kobilarov< > wrote: uDBpedia resources are always identified by the English Wikipedia ID, such as Amazon_Rainforest. If there's no English article for a concept, such as for DBpedia resources for that concept, hence no rdf data. That's at least the current situation. We are aware that this is quite limiting. But for now there's nothing except \"plans\" to change it in the future. Cheers, Georgi" "Lot's of eggs, but where is the chicken? Foundation of a Data ID Unit" "uDear all, I would like to wish you a Happy Easter. At the same time, I have an issue, which concerns LOD and data on the web in general. As a community, we have all contributed to this image: three years old) You can see a lot of eggs on it, but the meta data (the chicken) is: - inaccurate - out-dated - infeasible to maintain manually (this is my opinion. I find it hard to belief that we will start updating triple and link counts manually) Here one pertinent example: -> (this is still linking to DBpedia 3.5.1) Following the foundation of the DBpedia Association we would like to start to solve this problem with the help of a new group called DataIDUnit ( consensus and working code\" as their codex: The first goal will be to find some good existing vocabularies and then provide a working version for DBpedia. A student from Leipzig (Markus Freudenberg) will implement a \"push DataId to Datahub via its API\" feature. This will help us describe the chicken better, that laid all these eggs. Happy Easter, all feedback is welcome, we hope not to duplicate efforts. Sebastian uHi, Sebastian, I’ve been trying to address similar issues to what you describe for a couple of years now. This is the best I’ve been able to piece together. Hopefully it can be useful in your new efforts. Regards, Tim walks a “good” VoID description and down-codes it into the CKAN JSON using the lodcloud group’s conventions [1]. walks a CKAN JSON using lodcloud group’s conventions [1] into “good” VoID descriptions. (special thanks to ww’s modeling in Prizms [2] nodes [3] dereference their own VoID metadata [4] and update their datahub.io listings every week [5]. (They also notify Sindice when they publish new datasets, but that’s a different topic…) [1] [2] [3] [4] [5] On Apr 17, 2014, at 9:49 AM, Sebastian Hellmann < > wrote: uHi Michel, this looks very similar. DataID is just a fancy name to pool efforts under it. Here is a Gretchenfrage: Do you have a list of URLs with existing HCLSDatasetDescriptions ? If yes, could you submit this list here: If we end up with a slightly different format (which I don't think will happen) we can write a small converter. We are also looking for people to join the group and provide more client implementations, i.e. DataId to Jena Assembler or Virtuoso Load Scripts or LOD2 Debian packages. All the best, Sebastian On 17.04.2014 22:55, Michel Dumontier wrote:" "303 redirects oddity for URIs containing ":"" "uHi! 
Apparently there's something odd with the 303 redirects for resources with \":\" in their title. Basically, that seems to work from for example curl, but it fails from Java. I'm not sure what component is buggy there. Example: $ curl -v -H \"Accept: application/rdf+xml\" < HTTP/1.1 303 See Other < Content-Location: /data/X-Men%3A_Evolution.xml $ curl -H \"Accept: application/rdf+xml\" is fine. $ curl -H \"Accept: application/rdf+xml\" isn't - that strangely returns some foaf triples though (seems these are returned for whatever data/ URI you request). Java seems to get redirected to the latter (broken) URI: url = \" URL urlU = new URL(url); HttpURLConnection uc = (HttpURLConnection) urlU.openConnection(); uc.setInstanceFollowRedirects(true); uc.setRequestProperty(\"Accept\", \"application/rdf+xml\"); uc.connect(); InputStream is = uc.getInputStream(); int read; while ((read = is.read()) != -1) { System.out.write(read); } outputs the triples the last (broken) curl command also fetches. Bug in Java? Bug in Virtuoso? I found a related discussion at [1] but that didn't cover the \":\" case. Regards Malte [1] msg00776.html uMitko Iliev wrote: seems not to disallow colons in URI paths for HTTP at least: path = path-abempty ; begins with \"/\" or is empty path-abempty = *( \"/\" segment ) segment = *pchar pchar = unreserved / pct-encoded / sub-delims / \":\" / \"@\" I'm no expert on this matter though. DBpedia *does* use colons in URIs anyways I did some quick testing with Firefox; it looks like there's no URLDecoding/URLEncoding going on when following Location: headers in 303 redirects there, so Firefox behaves just like Java does. Also interesting: $ curl -v (just normal HTML!) < HTTP/1.1 303 See Other < Location: No escaping going on here when doing the normal HTML request. So I guess this is a bug in Virtuoso when requesting \"application/rdf+xml\" (and a somewhat strange bug in curl perhaps). Regards Malte uThis also affects URLs with ( ), and there also seems to be a bug in the actual triples. Compare: and The one without encoding has the yago:blah triples, the other the normal dbpedia stuff. - Gunnar On 06/05/10 15:06, Malte Kiesel wrote: uAre any of the displaying this behaviour? DBpedia doesn't seem to encode the : between \"Category\" and the category name, even if it percent encodes the category name. On 6 May 2010 01:05, Malte Kiesel < > wrote: uPeter Ansell wrote: Yes, this also concerns Category:XXX resources: curl -v -H \"Accept: application/rdf+xml\" < HTTP/1.1 303 See Other < Location: curl -v -H \"Accept: application/rdf+xml\" (only some foaf:primaryTopic triples there) curl -v -H \"Accept: application/rdf+xml\" (this returns the correct data) curl -v < HTTP/1.1 303 See Other < Location: Note the \":\" does not get transcoded for this redirect (HTML). The same also applies to parentheses (as Gunnar pointed out): curl -v -H \"Accept: application/rdf+xml\" ' < HTTP/1.1 303 See Other < Location: http://dbpedia.org/data/The_Good_Shepherd_%28film%29.xml so here the () got URLEncoded for some reason curl -v 'http://dbpedia.org/resource/The_Good_Shepherd_(film)' < HTTP/1.1 303 See Other < Location: http://dbpedia.org/page/The_Good_Shepherd_(film) while that's not done for normal HTML requests. Additionally, as Gunnar said, there seem to be related (but separate) bugs in the DBpedia extraction framework causing triples for one resource to get scattered over multiple resources with different URI encodings. Regards Malte" "years BC problem" "uHello I'm observing a strange ( ?) 
behaviour on the dbpedia Sparql-endpoint, and I wonder whether this is due to the underlying data or to the Sparql-endpoint itself (virtuouso). Following the dumps of dbpedia 2014, the data seems OK though. The query select distinct * { dbpedia:Cicero ?o } Returns -043-12-07 (7th of July 43 BC) which is what I expected. However select distinct * { ?s ?o filter(?o < \"-43-01-01\"^^xsd:date) } limit 100 List all sorts of things (mainly persons though, but none with a birthdate before 43 BC). Browsing the dbpedia dumps I found several person born before 43 BC. Does anybody know where this error comes from ? Best regards Johannes Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration, Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci. This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. u0€ *†H†÷  €0€1 0 + uHi Johannes, Thanks for reporting this, indeed that's a data extraction issue. The incorrect dates seem to come from Persondata templates, e.g. Cheers, Volha On 10/9/2014 1:19 PM, Kingsley Idehen wrote: uHi again, Thanks for your quick answer. But I think the data is correct and a query like select distinct * { ?s ?o filter(?o < \"-43-01-01\"^^xsd:date) } limit 10000 Should return persons born before 43 BC, which it does not. I get either person born in 1 AD (mostly data errors) and even persons born in 1100 AD. So the filter does not work Interestingly if I want persons born after 43 BC with select distinct * { ?s ?o filter(?o > \"-43-01-01\"^^xsd:date) } limit 10000 I get only about 20 results (instead of nearly every person) Is there an error in my sparql query ? (I tried also filter(?o < \"-43\"^^xsd:gYear) etc but observed same problem). Best regard Johannes De : Kingsley Idehen [mailto: ] Envoyé : jeudi 9 octobre 2014 13:19 À : Objet : Re: [Dbpedia-discussion] years BC problem On 10/9/14 3:51 AM, wrote: Hello I'm observing a strange ( ?) behaviour on the dbpedia Sparql-endpoint, and I wonder whether this is due to the underlying data or to the Sparql-endpoint itself (virtuouso). Following the dumps of dbpedia 2014, the data seems OK though. The query select distinct * { dbpedia:Cicero ?o } Returns -043-12-07 (7th of July 43 BC) which is what I expected. However select distinct * { ?s ?o filter(?o < \"-43-01-01\"^^xsd:date) } limit 100 List all sorts of things (mainly persons though, but none with a birthdate before 43 BC). Browsing the dbpedia dumps I found several person born before 43 BC. Does anybody know where this error comes from ? Best regards Johannes Its in the data: 1. 2. ." "Character encoding in SPARQL queries" "uHi all, Am wondering if anyone might have a suggestion as to why the following query might be failing when run against the SPARQL endpoint, but work when executed using SNORQL. 
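One detail worth checking in the years-BC thread above: the xsd:date lexical form requires a four-digit (zero-padded) year, so "-43-01-01"^^xsd:date is not a valid literal, and comparisons against an invalid literal can fail silently or behave inconsistently. Whether or not that fully explains the odd filter results, the padded form is the safe one to use (a sketch; dbpedia-owl:birthDate stands in for the full property IRI, and how a given store orders negative years is still worth testing):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?s ?date WHERE {
  ?s dbpedia-owl:birthDate ?date .
  FILTER ( ?date < "-0043-01-01"^^xsd:date )
}
LIMIT 100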
After reviewing for a while I am starting to think that it is due to the comma (,) in the dbpedia.org resource string. Here's why: This works via SNORQL: SELECT ?abstract WHERE { ?abstract. FILTER langMatches( lang(?abstract), 'en') } However this doesn't (notice I replaced %2C with (,): SELECT ?abstract WHERE { ?abstract. FILTER langMatches( lang(?abstract), 'en') } The following query, executed via the SPARQL endpoint returns no results. This is the encoded version of either of the above queries: SELECT+%3Fabstract+WHERE+%7B+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource %2FToronto%2C_Ohio%3E+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fabstract %3E+%3Fabstract.+FILTER+langMatches%28+lang%28%3Fabstract%29%2C+%27en %27%29+%7D What's I'm guessing is that, the query is being decoded prior to execution. This decoding is converting the %2C back to a (,) which is no longer a valid dbpedia resource. Can anyone see an error in how I'm going about this, or suggest a method of executing this query against the SPARQL endpoint? Thanks, Matthew Full request / response POST /sparql HTTP/1.1 Accept-Encoding: identity Host: dbpedia.org Content-Type: application/x-www-form-urlencoded Content-Length: 223 Accept: application/rdf+xml query=SELECT+%3Fabstract+WHERE+%7B+%3Chttp%3A%2F%2Fdbpedia.org %2Fresource%2FToronto%2C_Ohio%3E+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty %2Fabstract%3E+%3Fabstract.+FILTER+langMatches%28+lang%28%3Fabstract %29%2C+%27en%27%29+%7DHTTP/1.1 200 OK Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: Keep-Alive Date: Thu, 28 May 2009 11:48:20 GMT Accept-Ranges: bytes Content-Type: application/rdf+xml; charset=UTF-8 Content-Length: 314 abstract Hi all, Am wondering if anyone might have a suggestion as to why the following query might be failing when run against the SPARQL endpoint, but work when executed using SNORQL. After reviewing for a while I am starting to think that it is due to the comma (,) in the dbpedia.org resource string. Here's why: This works via SNORQL: SELECT ?abstract WHERE { < rdf:RDF> uOn Thu, May 28, 2009 at 12:58 PM, Matt Trinneer < >wrote: Alas, this is the encoded version of the /latter/ query only. The two have different forms, and thus should have different encodings. The encoding of the first should encode the \"%\" in the resource URI as \"%25\", so that your encoded query is: SELECT+%3Fabstract+WHERE+%7B+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FToronto%252C_Ohio%3E+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fabstract%3E+%3Fabstract.+FILTER+langMatches%28+lang%28%3Fabstract%29%2C+%27en%27%29+%7D%0D Note that you can get this from SNORQL directly by grabbing the contents of the URL after ?query= Hope that helps :) Cheers, Peter What's I'm guessing is that, the query is being decoded prior to execution. uOn 28-May-09, at 8:52 AM, Peter Coetzee wrote: Appreciate the help Peter. Should have caught that one myself. Next time Also, thanks for the tip re: SNORQL URLs Best, Matthew" "Running a periodic, automated batch of queries against a live DBpedia" "uHello everyone! I have three questions/requests, hopefully they will be easy ones to answer/implement. First, I would like to run an automated, periodic batch of SPARQL queries against a somewhat up-to-date DBpedia endpoint, and get, at a minimum, the and properties. I am currently using the query: SELECT * WHERE { ?rel ?value . } where $title is inserted from a list of (currently) 125 articles I am interested in, making each request individually and storing the response. 
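A possible way to lighten the periodic refresh described above: where the endpoint supports SPARQL 1.1, the article list can be folded into a single query with a VALUES block instead of one HTTP request per title. A sketch, in which the two resource IRIs are placeholders for the real list and foaf:isPrimaryTopicOf is assumed as the link back to the Wikipedia page:

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?article ?abstract ?page WHERE {
  VALUES ?article {
    <http://dbpedia.org/resource/Berlin>
    <http://dbpedia.org/resource/Chicago_Defender>
  }
  ?article dbpedia-owl:abstract ?abstract .
  OPTIONAL { ?article foaf:isPrimaryTopicOf ?page }
}

Batches of a few dozen IRIs per request usually keep both the query size and the per-request load reasonable.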
I had been running it against while I developed the code, but today saw this thread: and tried changing the endpoint to That server appears to be rate-limited or actively anti-automation, as I got a 503 error on the 9th request (the first 8 went through in a second or two). So, is there a place I can go to, or an API key I can obtain, such that I'd be able to refresh our Wikipedia abstracts on, say, a daily or weekly basis using fresh Wikipedia data? Again, it is a very limited set of articles I am interested in (low hundreds), so the burden on the other end would be fairly minimal, and I can schedule it to whatever time suits you. Secondly, another downside is the two live endpoints mentioned in that thread (the other being have a different set of triples from both each other and from the biannual regular DBpedia. Neither of them include the full abstract that I am interested in. Contrast: Against the very detailed: (though the lod2 one seems to be lacking in @xml:lang attributes on its non-English HTML elements!) Does anyone feel like adding these missing predicates to the live DBpediae? My third question is, all of the endpoints provide a foaf:primaryTopicOf edge pointing to the English wikipedia page — surely it should have all languages? Ideally, I would like links to each of the other language Wikipedia articles with some tie between the abstract and the wikipedia URL it came from (as there's not a 1:1 relation between ISO language codes and wikipedia subdomains, e.g. \"ChÅ«-jiân\"@nan -> http://zh-min-nan.wikipedia.org/wiki/Ch%C5%AB-ji%C3%A2n so trying to generate a URI myself using the refs:label + language code will not always work). How could this be done, and is anyone willing to do it? TBH I would be happy if the URIs were merely string literals tagged with the corresponding ISO language, though that's obviously far from ideal in terms of LOD. Perhaps both string literals and an array of (untagged) foaf:primaryTopicOf triples would be good enough. Of course I'll answer any questions that may help me get to where I want to be." "DBpedia, Yago Class Hierarchy, and Virtuoso Inferencing" "uJens Lehmann wrote: Jens, The current version of Virtuoso has a number of suboptimal issues relating to the initial Inference implementation that includes: Memory management issues (as you encountered) emanating from heuristics for traversing OWL class hierarchies (tree) when processing inference rules. With the fix we now start from the class hierarchy roots and traverse to the bottom of the OWL tree instead of every superclass using a dynamic hash , which grew exponentially. Thus, Virtuoso now uses a list (array of 100 preallocated elements) for a branch of the OWL tree, and when this fills up it extends dynamically. Also, it now checks for subclass definition presence in inference rules based on a hash instead of a linked list. A new Virtuoso release is imminent, the only hold up right now is the completion of Model Providers for: Jena, Redland, and more than likely Sesame. The pervasiveness of these frameworks has encouraged us to make Virtuoso play as well as can be in each of these realms. The goal is to make Virtuoso's performance and scalability features easier for these communities to exploit. Anyway, the live DBpedia has been updated. Important Note: The injection of Yago Class Hierarchy rules into the current data set is an enhancement to the current DBpedia release. 
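For anyone wanting to exercise the Yago class-hierarchy rules mentioned above once they are in place: Virtuoso exposes named inference rule sets to SPARQL through the DEFINE input:inference pragma, so subclass reasoning can be switched on per query. The rule-set name below is only illustrative, it has to match whatever name was registered on the server with rdfs_rule_set(), and Albert_Einstein is just a convenient test resource:

DEFINE input:inference "http://dbpedia.org/resource/inference/rules/yago#"
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?type WHERE {
  <http://dbpedia.org/resource/Albert_Einstein> rdf:type ?type .
}

With the pragma active the result should also contain the superclasses reachable through the Yago hierarchy, not only the directly asserted types.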
Behind the scenes, there is work underway, aimed at providing an enhanced Class Hierarchy and associated inference rules using UMBEL (which meshes Yago and OpenCyc). Once the next Virtuoso release out, those of you with local installation will be able to do the following via iSQL or the Virtuoso Conductor: uHello Kingsley, Kingsley Idehen schrieb: [] [] These are excellent news. Kudos to the OpenLink team for making this possible. :-) Lightweight reasoning on very large knowledge bases is one of the main challenges in the Semantic Web area, so this is another step forward. Enabling inference for DBpedia will (and has already) serve you as a test bed for assessing Virtuoso performance and stabilising further. Kind regards, Jens" "more dbpedia georeferenced datatypes than wikipedia georeferenced articles" "uHello, can anybody explain to me, why it is this following way, that dbpedia has more georeferenced \"things\" or entries around 986000 [ best regards uHi, On 11/09/2012 04:03 PM, wrote: this is because some articles may have multiple values for the coordinates of the place, e.g. If you check its Wikipedia source at [1], you will find that there is a coordinate value in the infobox, as follows: *|latd = 27 |latm = 50 |lats = |latNS = S |longd = 48 |longm = 25 |longs = |longEW = W* and there is also another value in the \"External links\" section with value: *{{Coord|-27.5717|-48.6256|type:city|display=title}}* and since both values are extracted you will have 2 coordinate values for the same DBpedia resource. [1] index.php?title=Florian%C3%B3polis&action;=edit" "Open Positions @ DBpedia" "uAny drupal developers, ontology enthusiasts or organizational talents out there? DBpedia needs you! We are currently looking for a drupal developer/UI volunteer or in-kind contribution for the dbpedia.org website. Your engagement will help to make the content & mission of DBpedia easily accessible and share the DBpedia spirit. Furthermore, the recently founded ontology group needs technical support and someone to manage and extend the DBpedia ontology on a day-to-day basis, i.e. merging requests, fixing code and create validation mechanisms for the ontology. We think that this position is best filled by a PhD student, whose topic is in the area of ontology maintenance and curation or by a developer of a company that uses the ontology for their products. The next position that is needed at DBpedia is someone with great organizational and communicational skills as chair or co-chair for the DBpedia communications group. Your job will be to coordinate dissemination efforts of DBpedia with the support of Sandra Praetor and Julia Holze from the DBpedia Association to give this working group focus and direction. To help young academics in the field of DBpedia to prosper, we are looking for senior scientists that will serve as (education) contacts for Bachelor, Master and PhD students, who write their thesis about DBpedia. You job will be to serve as an initial contact point for interested students, find supervisors as well as to mediate between DBpedia members and students in that regard as well as some minor organizational issues with educational institutions. For a detailed job description, please visit: You are interested in one of the positions? Perfect, just get in touch with us via In case the offered positions are not quite your thing, I am sure we have other options of involvement available for people who want to exploit DBpedia’s great potential. Just send us your ideas! 
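The double-extraction effect described in the georeferencing thread above (coordinates in the infobox plus a {{Coord}} template elsewhere in the article) can be measured rather than guessed at; a sketch that lists resources carrying more than one latitude value, using the W3C WGS84 vocabulary that DBpedia publishes coordinates in:

PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s (COUNT(?lat) AS ?n) WHERE {
  ?s geo:lat ?lat .
}
GROUP BY ?s
HAVING ( COUNT(?lat) > 1 )
LIMIT 100

The size of that list goes a long way towards explaining why the triple count is higher than the number of georeferenced Wikipedia articles.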
For further information and detailed job descriptions, please check our website: With kind regards, Julia Holze" "Explore Google Summer of Code Projects" "uHi all, You may be interested in knowing that we've hacked up a demo of how DBpedia can be used for exploring open source projects that applied as organizations to the Google Summer of Code [1]. Projects were annotated using DBpedia Spotlight [2]. The demo is here: Thanks to Jimmy O'Regan for the hack idea, Max Jakob for his work on the 9pm-2am shift, a bar in Friedrichshain for the free WIFI, and Jo Daiber for the UI pretty-up on Saturday. By the way, if you are a student, applications will start soon. This program is an *awesome* way to get involved with Open Source! Visit: Cheers, Pablo [1] [2] Hi all, You may be interested in knowing that we've hacked up a demo of how DBpedia can be used for exploring open source projects that applied as organizations to the Google Summer of Code [1]. Projects were annotated using DBpedia Spotlight [2]. The demo is here: spotlight.dbpedia.org" "Resouces in non-English" "uDear all, I would like to ask about the resources in non-English. I've downloaded the file of Portuguese labels to use DBpedia for Portuguese. When I query the word \"Agricultura tradicional \", I cannot obtain any resouces in DBpedia. Actually, we have the information of that word in wikipedia: I think the reason I couldn't get any resources in DBpedia is \" Agricultura_tradicional \" is not written in English. Do you know how DBpedia manages those resources like that and how I can get the information of those results? Thank you, Cheers, Thanh Tu Dear all, I would like to ask about the resources in non-English. I've downloaded the file of Portuguese labels to use DBpedia for Portuguese. When I query the word ' Agricultura tradicional ', I cannot obtain any resouces in DBpedia. Actually, we have the information of that word in wikipedia: Tu uHello Thanh, to a corresponding article in English, that's why DBpedia does not extract any data from it. DBpedia currently only uses those non-English articles that have an 'interwiki link' to the English article, because the DBpedia subject URI is generated from the title of the English article. For example, DBpedia extracts an abstract from to the English article and generates a triple for the subject Bye, Christopher On Wed, Jan 27, 2010 at 16:43, Nguyen Thanh Tu < > wrote:" "Images in DBpedia now link back to Wikipedia" "uAll, Images on DBpedia pages now link back to the Wikipedia image page that describes the picture and states its copyrights. I made this change in response to a complaint that we show these images without proper attribution. I think that adding these links should address the complaint. I use a mod_rewrite hack to create the proper URL for the Wikipedia image page. I *think* it should work for all images, but I'm not sure. I would appreciate some help with testing this. Please have a look at your favourite pages and check if the image is a link, and if clicking the image takes you to the right place. Please report any problems to me. Cheers, Richard" "missing values in query results" "uDear all, I did the following query to the dbpedia through www.dbpedia.org/sparql PREFIX geo: PREFIX p: SELECT ?player ?club WHERE{ ?player p:cityofbirth . 
OPTIONAL {?player p:currentclub ?club}} In it I get the following results: player club football) Next I do the following query (in which I query for all the players who belong to a club): SELECT ?player ?club WHERE { ?player ?club .} And I get results but I do not get the following results: http://dbpedia.org/resource/Tore_Andr%C3%A9_Flo http://dbpedia.org/resource/Milton_Keynes_Dons_F.C. http://dbpedia.org/resource/Per_Egil_Flo http://dbpedia.org/resource/Sogndal_Fotball http://dbpedia.org/resource/H%C3%A5vard_Flo http://dbpedia.org/resource/Retired The purpose of these queries is to do locally the join operator but I get different results from the basic queries so I do not get the proper results. Any idea? I do not know what I'm missing :S Thanks, Carlos u2009/11/3 Carlos Buil Aranda < >: In the first query you used OPTIONAL so that you could get some results for players even though they don't have a current club. You will notice that you don't have any clubs matching those missing players in the results of the first query because of the OPTIONAL. If you take the OPTIONAL out you will get the same results from both queries unless any of the players with current clubs do not have city of birth defined in wikipedia. Cheers, Peter uThe point (if I'm not mistaken) is that these players do have a current team: SELECT ?player ?club WHERE { ?player ?club .} But he appears in the query: PREFIX p: SELECT ?player ?club WHERE{ ?player p:cityofbirth . OPTIONAL {?player p:currentclub ?club}} Carlos uHi all, I've been trying several queries and I did not realize that I'm querying a too large dataset and most probably there is a limit on the results (right?, select * where {?player p:currentclub ?club}). Is it possible to get all the results? Maybe an incremental result from the query or similar? I'm using Jena to query all these data. Thanks!! Carlos uCarlos Buil Aranda wrote: You can get up to 1000 or 2000 result sets per query. If you want more, you can use ORDER BY combined with LIMIT and OFFSET for up to 40000 result sets. YMMV - it's been a while for me. I've since set up my own local copy as I need to run many queries :) Regards, Michael" "Literature on tracking changes in generic databases?" "uHere's a (not so) hypothetical question. Suppose I've got some database that was populated from some data source like Freebase or DBpedia. I've got my own internal identifiers, so I can say that dbpedia:George_Washington -> mysystem:88582 fbase:/en/george_washington -> mysystem:88582 The whole reason I'm doing this is so I can make assertions about these entities. Some of these come from Freebase and DBpedia and some of them come from other places. If I look at a single moment in time, I feel like I've got this under control. I can load DBpedia version X or a freebase data dump from a particular week and it all works perfectly. Of course, at some point my data gets stale and I need to update it. For instance, there's a new batch of players who got drafted by the NFL I want to know about, or I want to know about movies that are going to be released soon, or have some vocabulary for the Fukushima muclear accident or maybe I just try to extract some facts from the latest Freebase dump and discover that a distressing number of mids have changed in the last six months. So I need some way to keep my knowledge base synced up with changes in external knowledge base(s) although I might settle for treating one as authoritative. 
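To make Michael's paging advice in the missing-values thread above concrete: the usual workaround for the per-query result cap is a stable ORDER BY combined with LIMIT and OFFSET, stepping the offset until a request comes back empty. A sketch, with p: assumed to be the http://dbpedia.org/property/ namespace used earlier in the thread:

PREFIX p: <http://dbpedia.org/property/>
SELECT ?player ?club WHERE {
  ?player p:currentclub ?club .
}
ORDER BY ?player
LIMIT 1000
OFFSET 2000   # third page of 1000 rows; raise in steps of 1000

Without the ORDER BY the pages are not guaranteed to be disjoint, so rows can be missed or duplicated across requests.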
It's not clear that there's sufficient information in DBpedia to track changes (although some renames could be tracked by following the wikipedia id's) but I think a good job can be done with Freebase: mid redirect records can be used to follow renames and merges, and at least sometimes with splits there are gardening hints that help. I can imagine heuristics that would probably work on an ad hoc basis, but it seems to me there ought to be an intellectually consistent approach to this problem. Is anyone familiar with anything in the literature on this?" "regarding dbpedia endpoint" "uHello, I have been using the following query in the dbpedia virtuoso PREFIX rdfs: PREFIX owl: SELECT ?class WHERE { ?class rdfs:subClassOf owl:Thing . ?person rdf:type ?class . ?person \"India\"@en. } I get three results there, but when I use Jena to get the results I just receive a single result. Could you please help me in getting all the results correctly in JENA API. RESULTS are : class In Jena, result is : ( ?class = ) -> [Root] ( ?class = ) -> [Root] uMariya, It sounds like your question is more Jena related than DBpedia related. You may get better help at Have you tried that? Also, I assume you're pointing Jena to the same SPARQL endpoint? Or are you loading data from a file? Best, Pablo On Wed, Oct 26, 2011 at 4:29 AM, Mariya.Pervasive < >wrote:" "Bug in image data (CSV file)" "uIt seems that the image extractor may have a bug. I think it's cutting the URLs when it finds characters other than [A-Za-z0-9_/]. Errors are easy to find, just run: grep /depiction image_en.csv | grep -v 200 | head (this finds some of the errors, but not all). Here are the first few: %40Mail Adam_Meredith Airline_Highway Bir-Hakeim_%28Paris_M%C3%A9tro%29 If you access Adam_Meredith, for example, the image URL is: http://upload.wikimedia.org/wikipedia/commons/2/26/Plum%4072.jpg The bug happens on both img and depiction fields. The nt file does not have this problem. Since I'm processing the CSV files instead of the NT ones, please post when you find the bug if it is affecting only the images file or other files as well" "Mashup examples" "uPlease forgive the self-promotion, but I have two examples of XQuery/DBpedia mashups now running (cross-fingers) which may be of interest: UK football clubs with maps of the birthplaces of their players: Rock and Roll Groups and their discography as a SIMILE timeline Code (such as it is - I hope to learn better SPARQL from Andy Seaborne at HP next door ! ) is in the XQuery Wikibook More work needed on synonyms and redirects but I hope they will inspire my students - maybe even to do some work on Wikipedia editing. Chris Wallace UWE Bristol This email was independently scanned for viruses by McAfee anti-virus software and none were found uI love self-promotion when people promote such nice demos :-) Cheers Chris uHi Chris, your mashups look great! It’s a pleasure to see people actually using DBpedia for cool projects like yours. I’m interested in your experiences of using our dataset as we are keen to get to know our users and assist them wherever possible in order to make DBpedia a useful data source. So if you’re at HP Labs to talk to Andy, feel free to pop in, I’m sitting just round the corner ;) Cheers, Georgi uChris Wallace writes: Hmmm. We appear to be lacking data for Queen and REM ~Tim uTim Thanks for spotting that - it's not missing data though, just a uri-encoding problem. 
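A note on the "regarding dbpedia endpoint" question further up: as archived, the query declares rdfs: and owl: but not rdf:, and the predicate of the "India"@en pattern did not survive, so a self-contained variant would look roughly like the sketch below. The http://dbpedia.org/property/birthPlace predicate is purely a placeholder for whatever the original pattern used; when Jena and the HTML form disagree on the number of rows, adding DISTINCT and checking client-side result limits is the first thing to rule out.

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?class WHERE {
  ?class rdfs:subClassOf owl:Thing .
  ?person rdf:type ?class .
  ?person <http://dbpedia.org/property/birthPlace> "India"@en .   # placeholder predicate
}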
I realize now that URIs in SPARQL must be uri-encoded, and XQuery was decoding the parameter so it needs re-encoding when constructing the SPARQL query. Chris" "SPARQL endpoint on local installation of the DBpedia extraction framework" "uHi, How do I do to have a SPARQL endpoint on a local installation of the DBpedia extraction framework? Thanks José Paulo Leal begin:vcard fn;quoted-printable:Jos=C3=A9 Paulo Leal n;quoted-printable:Leal;Jos=C3=A9 Paulo org;quoted-printable:Universidade do Porto;Departamento de Ci=C3=AAncia de Computadores adr:;;DCC - R. do Campo Alegre, 1021/1055;;;4169-007 PORTO;Portugal email;internet: title:Professor Auxiliar tel;work:+351 220 402 973 x-mozilla-html:FALSE url: version:2.1 end:vcard" "TBox Meshup: opengraph meets dbpedia and beyond" "uAll, One of the cool things that came out of yesterday's opengraph schema enhancements was the mapping to DBpedia's ontology. Naturally, the effects of this aren't obvious if you can't even view the ontology etc Here is a link that shows the effect of TBox meshups: BTW uKingsley, when I open your link [1] I see the statements I think it'd be better to solve the existing issues with the data in DBpedia beforeadding arbitrary new ones. Georgi uGeorgi Kobilarov wrote: Georgi, The context of my demo is this: 1. Yesterday, the opengraph folks were gently introduced to the power of Linked Data via schema mapping 2. The people who participated in the mapping smartly added an owl:equivalentClass mapping that hooked into the DBpedia ontology 3. I then put out Web page URL to demonstrate expansive effect of this single TBox assertion (Fred Giasson used to call this Domain Explosion) . That's it. Nothing to do with other imperfections that may exist in DBpedia or the DBpedia ontology. If anything, should you follow-your-nose, my demo takes you to Yago, OpenCyc and UMBEL, at each stop giving the \"bholder\" an option to explore the underlying ABox data via different \"Context Lenses\". Anyway, to your specific point: Fixing DBpedia and mapping to DBpedia are tasks that can occur in parallel. Note (or remember): the crosslinks exist in a separate Named Graph from the main DBpedia Graph (go SPARQL against the DBpedia graph and you won't find this data). My page is what brings the two graphs together :-) Kingsley" "Virtuoso/ DBpedia VAD wrong encoding in RDF/XML data" "uHi, I've recieved a mail a couple of weeks ago from some users of the German DBpedia a few weeks ago who where reporting that they weren't getting any results when querying the endpoint for URIs that contained German umlauts(or any other utf8 characters). I reported the issue to the Jena mailing list and they fixed it, but in the process we also discovered a bug with Virtuoso. There is a problem with the IRI encoding in the DBpedia Internationalization VAD. Namely when querying the SPARQL endpoint the encoding of the IRIs in RDF/XML is garbled. The issue can be found in both Greek and German endpoints. For example: XML lines yo you will notice things linke simmilar issues if you look at this resource from the Greek DBpedia: This problems is that when querying the Internationalization Endpoints not only with Jena but with any other SPARQL client, the user is going to getting garbled IRIs if they contain UTF8 characters. Kind Regards, Alexandru Todor uHi Alexandru, This is a known issue and we reported it to virtuoso ~9 months ago. 
Unfortunatelly we use debian packages for our installation which usually are a little behind from the latest releases, so we can't say if it is fixed But, IRIs cannot be 100% serialized in RDF/XML. So even if Virtuoso fixes the encoding, the rdf might still be invalid Regards, Dimitris On Mon, Oct 17, 2011 at 6:42 PM, Alexandru Todor < > wrote: uHi Dimitris, They haven't fixed it yet since I'm using the latest Open Source version compiled from source. It is a simple encoding issue that can be worked around in java with a simple hack: output = new String(input.getBytes(\"ISO-8859-1\"), \"UTF8\"); . I don't have much knowlege about the different character sets, but I think they are encoding the UTF8 URIs in ISO-8859-1 instead of UTF8. This doesn't happen with the literal values since the UTF8 characters are escaped as ASCII. About the validity, the problem I've noticed is with the Property names, certain special characters such as brackets have to be filtered in order for it to be valid XML. I've had this issue with the German DBpedia, and as you can see right now all resourced produce valid XML and can be queries trough Sparql clients. If however there is a deeper problem with IRIs in RDF/XML that I'm unaware of, we should discuss it and push for another default serialization format for SPARQL. Regards, Alexandru On 10/18/2011 09:29 AM, Dimitris Kontokostas wrote: uHi, It would be quite nice to get an answer about this issue from someone at OpenLink since it seems that they do read this mailing list and this is a known issue. BTW I need to correct the title of this mail. The issue is not with the DBpedia VAD, it is with Virtuoso itself since the SPARQL endpoint returns the same garbled results. So at this time the Virtuoso IRI handling is broken at least when using SPARQL . Kind Regards Alexandru On 10/18/2011 09:29 AM, Dimitris Kontokostas wrote: uHi Alexandru, I have passed on your observation to the Virtuoso development team and i am awaiting an answer. Patrick uHi Patrick, Thank you. Just to specify what I mean by broken IRI support. I know IRIs work in Virtuoso quite good, better than in most other RDF Stores and it's just the RDF/XML serializer that has a small encoding bug, but RDF/XML seems to be the default serialization for SPARQL answers. People just use a common RDF framework, try to query the endpoint and get garbled results, after which they complain about the endpoint not working right. I know you can specify another serialization format like N3 or Turtle or use a small hack and get the right encoding, but I found that out the hard way as most people who try to query any Internationalized DBpedia endpoint will do. Kind Regards, Alexandru On 10/19/2011 05:08 PM, Patrick van Kleef wrote:" "DBpedia type labels" "uHi, In which file can I find the labels for DBpedia 3.7 types. For example, Mohamed uHello Mohamed , you have to go to the DBpedia 3.7 downloads page and download the Titles dump for the language you need On Wed, Sep 11, 2013 at 1:20 PM, Mohamed Yahya < >wrote: uThanks. I don't think the titles for types (under the namespace tried: wget bzcat labels_en.nt.bz2 | grep \"/ontology\" | wc -l It returns 0. On Wed, Sep 11, 2013 at 1:39 PM, Hady elsahar < > wrote: usorry didn't catch that , for owl classes you will find it's labels inside the DBpedia Owl file curl -s -S -I file -o ntriples | less On Wed, Sep 11, 2013 at 2:03 PM, Mohamed Yahya < >wrote:" "editing properties in mappings wiki" "uHi all! 
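Returning briefly to the type-labels question above: besides grepping the dumps or the OWL file, the class labels can be pulled straight from the endpoint (a sketch; the namespace filter assumes the ontology classes live under http://dbpedia.org/ontology/):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT ?class ?label WHERE {
  ?class a owl:Class ;
         rdfs:label ?label .
  FILTER ( STRSTARTS(STR(?class), "http://dbpedia.org/ontology/") )
  FILTER ( LANG(?label) = "en" )
}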
Some questions concerning editing the dbpedia ontology properties in the mappings wiki: 1. What is the best practice to set the values for rdfs:domain, rdfs:range, owl:equivalentProperty etc.? For example I changed the range of this property to rdfs:Literal to make it compatible with bibo:issn (as the owl:equivalentProperty). Do I have to use the full URI 2. What prefixes are allowed in the ontology anyhow? I could not find a hint in the wiki. 3. Why does \"rdfs:range = xsd:string\" link \"xsd:string\" to 4. One more question: Is owl:equivalentProperty repeatable? Thanks in advance! Carsten Carsten Klee Abt. Überregionale Bibliographische Dienste IIE Staatsbibliothek zu Berlin - Preußischer Kulturbesitz Potsdamer Straße 33 10785 Berlin Fon: +49 30 266-43 44 02 Fax: +49 30 266-33 40 01 www.zeitschriftendatenbank.de uDear Carsten, it seems you found all the weak spots, the wiki currently has ;) So you actually found three bugs they are not critical, I think, as there are work arounds. 1. Yes, I think so 2. There is a list somewhere. It is hardcodedwhich is bad 3. I can not answer this, it might be a bug 4. what do you mean by repeatable? equivalentProperty is a transitive, symmetric and reflexive equivalence relation. All the best, Sebastian On 03/06/2012 01:47 PM, Klee, Carsten wrote: uHi Carsten, hi Sebastian, On Mar 6, 2012, at 8:39 AM, Sebastian Hellmann wrote: this is actually not a bug. The first example property is an object property, which means xsd:string is interpreted as of namespace OntologyClass where there is obviously no class of that name. The second example property is a datatype property where xsd:string is a valid range and green and linked to the correct wiki page. You can also click the help link on the table top to get help for all the available template properties and their values. Cheers, Anja uHi Anja, hi Sebastian! Thanks for your reply. As I understand \"transitive\" means that dbprop:issn owl:equivalentProperty prism:issn when I state dbprop:issn owl:equivalentProperty bibo:issn because of bibo:issn owl:equivalentProperty prism:issn So repetition is not necessary. As to your answer to question 1: I will create the classes rdfs:Literal and rdfs:Resource. But I don't understand why all the vocabularies are created again in the wiki. The Link in Cheers, Carsten Carsten Klee Abt. Überregionale Bibliographische Dienste IIE Staatsbibliothek zu Berlin - Preußischer Kulturbesitz Fon: +49 30 266-43 44 02" "How do install dbpedia on mediawiki under XP" "uI have Mediawiki running under XP. I am lost on how to install dbpedia and how to get it going. Thanks uHi, please have a look at I assume you've already installed Xampp, so Apache and php is running on your machine. what do you intend to do with DBpedia? Building own extractors and/or extending the framework? Or just use the data? Cheers, Georgi From: on behalf of primestarguy Sent: Wed 26/09/2007 19:41 To: Subject: [Dbpedia-discussion] How do install dbpedia on mediawiki under XP I have Mediawiki running under XP. I am lost on how to install dbpedia and how to get it going. Thanks Be a better Globetrotter. Get better travel answers from someone who knows. Yahoo! Answers - Check it out. Hi, please have a look at out. uShould have given more info. Using IIS V5.1 with XP Pro PHP 5.2.3 mediawiki-1.10.1 mysql 5.0 So I have 4 items that then have 5 info boxs each with 12 to 15 items of two rows in each info box. All the data is stored local and don't need to access anything remote. 
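Carsten's transitivity point in the mappings-wiki thread above can be checked with a SPARQL 1.1 property path, which follows owl:equivalentProperty chains of any length; whether the bibo: and prism: statements are actually present depends on which vocabularies the endpoint has loaded, and dbpedia-owl:issn is assumed as the starting property:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?p WHERE {
  <http://dbpedia.org/ontology/issn> owl:equivalentProperty+ ?p .
}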
Need to answer questions like which item(s) have which feature. Thanks for the response and will keep reading. Georgi Kobilarov < > wrote: Hi, please have a look at I assume you've already installed Xampp, so Apache and php is running on your machine. what do you intend to do with DBpedia? Building own extractors and/or extending the framework? Or just use the data? Cheers, Georgi" "Retrieving data" "uHello,. I am a newbie, in the DBPedia project. I would like to know how can I use that, and what tools do I need installed on my machine. I saw some code samples but do not know where to look for a schema of the wiki. Please address me to the proper links so I can study that. Regards, Amit uHi Amit, On 11/21/2011 02:56 PM, Amit Bueno wrote: In order to get a DBpedia instance running, you should install OpenLink Virtuoso [1], and then load the datasets of the language(s) that are of interest to you. You can find the datasets of the DBpedia 3.7 at [2], which are available in N-Triples format. You can find the source code of the DBpedia project at [3]. The DBpedia framework is java/scala based. You can find a DBpedia SPARQL endpoint at [4]. You pose some SPARQL queries to that endpoint and get results for your queries, in order to get used to the DBpedia project. Hope that helps :). [1] [2] [3] [4] sparql uOn 11/22/2011 12:45 AM, Mohamed Morsey wrote: You can find the ontology and mappings of DBpedia at [5]." "DBpedia SVN mailing list" "uHello, for DBpedia developers, there is now a mailing list available, which automatically receives all SVN commits. If you want to receive SVN log messages and svn diffs (up to max. 100 KB per message), you can subcribe to : Kind regards, Jens PS: Developers will get a mail \"Your message to dbpedia-svn awaits moderator approval\" when they first commit source code. When this happens (and the message is indeed a DBpedia SVN commit), I will accept further messages/commits from those sf.net mail addresses, i.e. you can ignore this message." "Princesses: bump & grind?" "uIf there is an 'external links' section pointing eg. from a page about a tv show, to BBC's page for that show, do the dbpedia extractors do anything? Am assuming not, thinking that you're just doing infoboxes currently. This would be a nice extension since at least for the BBC case there is structured data to be had in RDF from the other end of the link (via either .rdf or conneg I think). IMDB and other URLs would be useful too, since they can be connected to users via bookmarks, link sharing mechanisms, user profiles etc. Not sure what to do about the risk of false positives though. Somewhat similar: ISBN extraction. Via million compared to the 15k in dbpedia; however these are \"mentions of ISBNs\" not necessarily pages about the relevant books. Dan uHi Dan, Yes, currently we are only extracting data and serve it on the Web. Yes, providing applications with access to the data on both sides of the link is clearly the goal, but I wonder whether this is a job for DBpedia or a job for a Semantic Web search engine like Sindice, SWSE or Falcons which crawls data from the Web and should provide apps with access to aggregated (maybe even merged and fused) data? For DBpedia our goal is to provide clean Wikipedia data and clean links pointing at other data sources. 
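On Dan's external-links question above: later DBpedia releases do extract the plain links into a dataset of their own, exposed as dbpedia-owl:wikiPageExternalLink, so they can be queried directly (a sketch; Doctor_Who is just an example resource, and the property applies to recent releases rather than the one current when the thread was written):

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?link WHERE {
  <http://dbpedia.org/resource/Doctor_Who> dbpedia-owl:wikiPageExternalLink ?link .
}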
In the context of other projects, I hope to have the chance to work more on all these exciting aggregation, merging, fusion, information quality and trust problems next year :-) Still very valuable :-) As we are currently mostly doing infobox extraction we miss most of them. We might next year move to extracting all templates from a Wikipedia article and then might hopefully get more of them. Cheers, Chris" "Installing dbpedia vad for virtuoso" "uHello, I'm trying to make a mirror of dbpedia with virtuoso, I've successfuly imported dbpedia data. I can use the sparql interface without problem and conductor works fine too. After that i've been trying to install the dbpedia vad, I can now access the /page/Entity pages, but there's only the owl:sameAs property displayed. Where might it come from and how can I fix this ? Thank you, Hello, I'm trying to make a mirror of dbpedia with virtuoso, I've successfuly imported dbpedia data. I can use the sparql interface without problem and conductor works fine too. After that i've been trying to install the dbpedia vad, I can now access the /page/Entity pages, but there's only the owl:sameAs property displayed. Where might it come from and how can I fix this ? Thank you, uHello, Did you load the triples in the \" Best, Dimitris On Mon, Apr 21, 2014 at 10:19 PM, Romain Beaumont < >wrote: uI got the triples from script and run ld_dir_all(' ', '*.*', ' then run rdf_loader_run(); SELECT * FROM DB.DBA.LOAD_LIST; show that ll_graph is the dumps How can I check if the triples are actually loaded under the \" 2014-04-21 21:28 GMT+02:00 Dimitris Kontokostas < >: uCan you try to remove the \"/\" from the graph? I think Virtuoso has a graph rename function (also from the conductor interface) Dimitris On Mon, Apr 21, 2014 at 10:39 PM, Romain Beaumont < >wrote: uI tried to, but as the dbpedia db is pretty big (a 44Go db virtuoso file), it doesn't work with the conductor interface (it just fill up the 16Go ram of my server), and the isql-vt interface doesn't work much better for this (I get \"Transaction aborted because it's log after image size went above the limit\") But that might still be the problem (I've seen \" 2014-04-21 21:45 GMT+02:00 Dimitris Kontokostas < >: uok so, doing log_enable(2); before UPDATE DB.DBA.RDF_QUAD TABLE OPTION (index RDF_QUAD_GS) SET g = iri_to_id (' = iri_to_id (' in isql-vt seems to go beyong that \"Transaction aborted because it's log after image size went above the limit\" error but now it just start using all the RAM againI guess deleting the db and starting from scratch will work, but if anyone know how to rename a big graph in virtuoso that would be helpful ? 2014-04-21 22:41 GMT+02:00 Romain Beaumont < >: uHi Romain, It seems you have been following the following tip on renaming a graph in Virtuoso: You can also increase the \"TransactionAfterImageLimit\" INI file param to get around the \"Transaction aborted because it's log after image size went above the limit\" error if you have sufficient free memory available as detailed at: but then it probably would be quicker to reload the datasets into an empty database from scratch Best Regards Hugh Williams Professional Services OpenLink Software, Inc. 
// Weblog uYes I've been trying to do - log_enable(3) and increasing TransactionAfterImageLimit : I still get the error \"Transaction aborted because it's log after image size went above the limit\" - log_enable(2) : it uses too much memory and end up using the swap : I didn't let it run to its end but I suspect it would take a long time because it's using the swap So I've decided to simply reload the database from scratch, it will probably be faster that way. Thank you for your help ! 2014-04-22 0:17 GMT+02:00 Hugh Williams < >: uReloading the database with the correct graph name ( not Thank you ! 2014-04-22 0:43 GMT+02:00 Romain Beaumont < >:" "Semantic Search Evaluation and Gold Standard" "u[APOLOGIES FOR MULTIPLE POSTINGS!] For the Evaluation of Semantic Search we need your help!! We are developing new approaches of semantic search, which of course have to be evaluated - and this is where we need you! Please help us to assess the quality of our algorithms by telling us which documents are relevant to a given search query, and by comparing different rankings. The web based evaluation can be done anywhere, anytime in the next two weeks, at You can also stop or pause it whenever you want and continue later. Not only that you will help us, but also any future research in this area, because we are going to publish a gold standard for semantic search based on your judgments. Helping fellow students and researchers should be a good enough motivation, but if it is not maybe the chance to win an Amazon gift card is? So, at the end of the evaluation period, all participants have the chance to win one of ten 10€ Amazon gift cards!! Thank you very much - we really appreciate your help! Best regards, Harald Sack Dr. Harald Sack Hasso-Plattner-Institut für Softwaresystemtechnik GmbH Prof.-Dr.-Helmert-Str. 2-3 D-14482 Potsdam Germany Amtsgericht Potsdam, HRB 12184 Geschäftsführung: Prof. Dr. Christoph Meinel Tel.: +49 (0)331-5509-527 Fax: +49 (0)331-5509-325 E-Mail: sack.html" "DBPedia ontology parsing" "uHi, I tried to download DBPedia ontology (from here The point is that I need to determine the hierarchy level of extracted concepts. For example, if concept was recognized as Game - assign level 2, while the concept recognized as Activity should be level 1 (Thing - Activity - Game). I think that all I need is whole ontology in format that is suitable for parsing. But the ontology (one or more) should include original DBPedia ontology, as well as Schema.org and Freebase types. Thanks, Srecko uOn 02/20/2013 06:31 PM, Srecko Joksimovic wrote: Did you download that file [1], or something else. [1] dbpedia_3.8.owl.bz2 uI thought I need the other three as well. Thanks, I'll try with that one. But, is everything there? When I issue a call to DBPedia service, I get concepts such as Freebase:\"something\". and I couldn't find one in this file. Best, Srecko From: Mohamed Morsey [mailto: ] Sent: Wednesday, February 20, 2013 18:54 To: Srecko Joksimovic Cc: Subject: Re: [Dbpedia-discussion] DBPedia ontology parsing On 02/20/2013 06:31 PM, Srecko Joksimovic wrote: Hi, I tried to download DBPedia ontology (from here Did you download that file [1], or something else. The point is that I need to determine the hierarchy level of extracted concepts. For example, if concept was recognized as Game - assign level 2, while the concept recognized as Activity should be level 1 (Thing - Activity - Game). I think that all I need is whole ontology in format that is suitable for parsing. 
But the ontology (one or more) should include original DBPedia ontology, as well as Schema.org and Freebase types. Thanks, Srecko [1] dbpedia_3.8.owl.bz2 uHi Srecko, On 02/20/2013 07:02 PM, Srecko Joksimovic wrote: This is the DBpedia ontology only. Actually DBpedia is linked to many other knowledge bases, e.g. FreeBase, and this list is available here [1]. Hope that clarifies it. [1] Downloads38#h236-1 uActually, yes. Thank you very much. I think this solves my problem. Best, Srecko From: Mohamed Morsey [mailto: ] Sent: Wednesday, February 20, 2013 19:22 To: Srecko Joksimovic Cc: Subject: Re: [Dbpedia-discussion] DBPedia ontology parsing Hi Srecko, On 02/20/2013 07:02 PM, Srecko Joksimovic wrote: I thought I need the other three as well. Thanks, I'll try with that one. But, is everything there? When I issue a call to DBPedia service, I get concepts such as Freebase:\"something\". and I couldn't find one in this file. This is the DBpedia ontology only. Actually DBpedia is linked to many other knowledge bases, e.g. FreeBase, and this list is available here [1]. Best, Srecko Hope that clarifies it. [1] Downloads38#h236-1 uHi, maybe I'm missing something, but I would appreciate your help on this one.If I use types like here and result is Freebase: Architecture/Architect, how can I determine the level of the resulting concept in the ontology? I understand that DBPedia is linked to other ontologies, but if I understand well, it should be possible to issue query such as this one I’m trying to implement… Srecko On Wed, Feb 20, 2013 at 8:30 PM, Srecko Joksimovic < > wrote: uHi Srecko, On 02/21/2013 12:29 PM, srecko joksimovic wrote: This thread might help [1] [1] msg00316.html uHi Mohamed, Looks like this is what I'm looking for. Thanks, Srecko From: Mohamed Morsey [mailto: ] Sent: Thursday, February 21, 2013 21:33 To: srecko joksimovic Cc: Subject: Re: [Dbpedia-discussion] DBPedia ontology parsing Hi Srecko, On 02/21/2013 12:29 PM, srecko joksimovic wrote: Hi, maybe I'm missing something, but I would appreciate your help on this one. If I use types like here Freebase, Schema.org), and result is Freebase: Architecture/Architect, how can I determine the level of the resulting concept in the ontology? I understand that DBPedia is linked to other ontologies, but if I understand well, it should be possible to issue query such as this one I'm trying to implement. This thread might help [1] Srecko [1] ml" "dump file naming conventions and formats" "uAm new to DBPedia. researching on using DBPedia for advanced analytics. I couldn’t find any references to the file formats and naming conventions used in dumps. For example, instance_types_*.*.bz2. How do I understand what is instance_types means and what is ’nt’, ’nq’, ’t/l' Thanks" "DBpedia Lookup PrefixSearch still down" "uDoes somebody know where the service runs, and whom to contact for it? Cheers, Joachim uHi Joachim, (Just trying to add to your cause): I have the same problem when trying to define a prefix dbpedia-owl from within LOD Refine. Please could somebody look into this matter? Regards, Gerard Van: Neubert Joachim [ ] Verzonden: maandag 31 maart 2014 10:39 Aan: 'dbpedia-discussion' Onderwerp: [Dbpedia-discussion] DBpedia Lookup PrefixSearch still down Does somebody know where the service runs, and whom to contact for it? Cheers, Joachim Disclaimer Dit bericht met eventuele bijlagen is vertrouwelijk en uitsluitend bestemd voor de geadresseerde. 
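Coming back to the "DBPedia ontology parsing" thread above: the hierarchy level Srecko asks about can be computed by loading the ontology file and walking rdfs:subClassOf upwards. A minimal sketch, assuming the rdflib library is installed and using the dbpedia_3.8.owl file mentioned in the thread; "depth" here simply counts subclass steps up to owl:Thing.

  from rdflib import Graph, RDFS, URIRef

  OWL_THING = URIRef("http://www.w3.org/2002/07/owl#Thing")

  # Load the ontology dump discussed above (RDF/XML).
  g = Graph()
  g.parse("dbpedia_3.8.owl", format="xml")

  def depth(cls, seen=frozenset()):
      """Number of rdfs:subClassOf steps from cls up to owl:Thing."""
      if cls == OWL_THING or cls in seen:
          return 0
      parents = list(g.objects(cls, RDFS.subClassOf))
      if not parents:
          return 0  # class with no declared superclass
      return 1 + min(depth(p, seen | {cls}) for p in parents)

  # Thing -> Activity -> Game should give 2, matching the example in the thread.
  print(depth(URIRef("http://dbpedia.org/ontology/Game")))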
uHi folks, Thanks for your heads up on the service being down. :-) The service is kindly maintained as a voluntary effort, for free, by the fine folks at U. Mannheim. The source code is also shared for free on Github for anybody that would like to run their own instance. Heiko Paulheim, Volha Bryl and Chris Bizer may be able to help you to get in touch with the sysadmin that keeps this up for us. While we're at it, let's make sure to thank them for keeping this service available for the community! Cheers Pablo On Mar 31, 2014 1:40 AM, \"Neubert Joachim\" < > wrote: uHi Pablo, I've forwarded this to our admins. Will keep you posted. Best, Heiko Am 31.03.2014 19:00, schrieb Pablo N. Mendes:" "Evaluation on automatic domain/topic identification on Linked Open Datasets - Please Particpate" "uApologies for cross-posting Hi All, We are working on an approach to automatically identify the domains/topics of LOD datasets. In order to evaluate our approach we would like to ask you to participate on our user study. Here we present 30 LOD datasets along with terms which potentially describe the datasets. Among these terms, some are more appropriate to use as descriptors of the datasets than others. As a user you are supposed to select the terms that best represent the domains/topics of the dataset. Here is the link for the study Your help is very much appreciated. Thank You Best Regards Sarasi Lalithsena" "Easiest deployment and querying of DBpedia without Amazon EC2 ?" "uHi there, I am currently looking (again) for the easiest way to deploy DBpedia data so I can run expensive queries on it. Recently there have been a lot of announcements regarding the availability of DBpedia via Openlink Virtuoso and Amazon EC2.
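For the Lookup threads above, a hedged sketch of calling the service over HTTP. The path and parameter names (PrefixSearch, QueryString, MaxHits) are as I recall them from the Lookup documentation of that era, so treat them as assumptions and check against the GitHub sources Pablo mentions if you run your own instance.

  import urllib.parse
  import urllib.request

  BASE = "http://lookup.dbpedia.org/api/search.asmx/PrefixSearch"  # assumed path

  params = urllib.parse.urlencode({"QueryString": "Berl", "MaxHits": 5})
  request = urllib.request.Request(
      BASE + "?" + params,
      headers={"Accept": "application/json"},  # omit this header for the default XML answer
  )
  with urllib.request.urlopen(request) as response:
      print(response.read().decode("utf-8"))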
Thats great, and I realise the potential of it, but lets just imagine that I have enough physical hardware resources here, but I dont want to have the overhead of setting up DBpedia data in Virtuoso: What is the easiest option for deployment and querying of DBpedia without Amazon EC2 ? Is there, lets say, a vmware image containing the same virtual machine as the EC2 AMI image ? Did anybody extract or convert the EC2 AMI image to something else? cheers, Benjamin. uBenjamin Heitmann wrote: uOn 9 Mar 2009, at 18:48, Kingsley Idehen wrote: Wow, both sound like great options, and I am glad they are available. Can you point me to the documentation, which explicitly describes each of these two options? If you provide e.g. a vmware virtual appliance then you can list it on the VMware marketplace. Details about that are here There is the possibility of a \"community contributed virtual appliance\", but I suspect that some place for hosting the download needs to be provided, yes. uHi Benjamin, The DBpedia installer script can be downloaded from the following location and contains a readme file detailing its usage: The current DBpedia backup we have is hosted in an amazon S3 storage location and thus you would need an amazon account to access it, restoring would then be pretty much as detailed for the EC2 AMI installation except you would not start an Virtuoso EC2 AMI instance, using your locally hosted Virtuoso installation to restore to instead: VirtEC2AMIDBpediaInstall Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 9 Mar 2009, at 19:02, Benjamin Heitmann wrote:" "Downloads2014 Surface Forms do not work" "uThe download page for DBpedia 2014 contains a section for the data set \"Surface Forms\" in nt, nq and ttl format but none of the links to these files, nor to the preview files work. They all lead to a html page instead that shows \"It works\". Here are the links to the full files as present in the page: Are these supposed to work? Thanks, Johann The download page for DBpedia 2014 Johann" "United Nations FAO geo-political ontology & dbpedia" "uGunnar Aastrand Grimnes wrote: Gunnar, How have you concluded that FAO and DBpedia aren't cross linked? I mapped them myself and uploaded to their own Named Graph (DBpedia and LOD Cloud Cache instances). I even mapped FAO to SUMO. Also note SUMO was already mapped to DBpedia by Adam Pease and I also loaded those mappings to their own Named Graph. Again, search via: 1. 2. Mitko/Patrick: I am in transit, so please double check that the named graphs with these mappings are still in place etc Kingsley uDear Soonho, I didn't see that the topic was renamed, so I also answered in another post. Here are some additions: Am 11.08.2010 15:40, schrieb Kim, Soonho (OEKM): As I wrote, there should be a system to upload and maintain data sets. As far as I know, there currently is no such system and none is planned currently (@all: correct me, if I am mistaken). All other sources where added manually, which was a good way in the beginning, where there have been only few datasets. This does not scale up however, as there could be a potentially large amount of mappings, which need to be maintained frequently. Even if they are URIs, there is no need to import anything. If the you use a owl:sameAs link and there is no other data available, then it will not affect anything. In other words, every application has the choice to import or not import additional data. 
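As an aside to the owl:sameAs discussion in the FAO thread above: a quick way to check whether cross-links to another dataset are actually loaded is to ask an endpoint for the owl:sameAs triples of a resource. The sketch below uses the public DBpedia endpoint, and dbpedia:Italy is purely an illustrative resource.

  import json
  import urllib.parse
  import urllib.request

  ENDPOINT = "http://dbpedia.org/sparql"

  # Outgoing owl:sameAs links for a single resource.
  QUERY = """
  PREFIX owl: <http://www.w3.org/2002/07/owl#>
  SELECT ?other WHERE { <http://dbpedia.org/resource/Italy> owl:sameAs ?other }
  """

  params = urllib.parse.urlencode({"query": QUERY, "format": "application/sparql-results+json"})
  with urllib.request.urlopen(ENDPOINT + "?" + params) as response:
      for row in json.load(response)["results"]["bindings"]:
          print(row["other"]["value"])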
Regards, Sebastian" "Dbpedia Lookup Service down" "uHi to all :), Second day now, I'm having a problem with Dbpedia Lookup Service. No matter what I try to find using this service I get the 503 Service Temporarily Unavailable message. The latest thing I've tried is calling the following Keyword Search method: and I still get the same message. Is it possible to fix it? Cheers, Nemanja Hi to all :), Second day now, I'm having a problem with Dbpedia Lookup Service. No matter what I try to find using this service I get the 503 Service Temporarily Unavailable message. The latest thing I've tried is calling the following Keyword Search method: Nemanja uNemanja, Thank you for your message. Our machine running the lookup is under heavy stress. We're currently investigating the issue. We'll let you know once we manage to work it out. Cheers, Pablo On Wed, Feb 29, 2012 at 9:20 AM, Nemanja Vukosavljevic < > wrote:" "Lists of values in Wikipedia infoboxes" "uFor a while now, I and other editors have been encouraging the use of HTML lists, emitted by templates {{Flatlist}}, {{Plainlist}} and {{Unbulleted list}} [1-3], which I introduced, for the inclusion of multiple values in infoboxes. The primary purpose of this is to enhance the accessibility of such content for people with visual impairments, who browse the website using screen readers. However, an additional benefit is that data reusers, including DBedia, should be able to recognise such values more easily than coma or line-break separated lists. For instance, \"Hamstead, London\" with a comma, or a line break, might refer to two distinct places, or one within an other. Marked up as either one list item, or two, the intended meaning will be clear. Naturally, it will take a long while for this to be ubiquitous on Wikiepdia, and there will always be some exceptions, but I thought it time I gave you a heads-up. Please let me know if and when you start to make use of this feature, or of you have any concerns, comments or questions. [1] [1a - example] [2] [2b - example] [3] Template:Unbulleted_list uHey Andy, Thanks for letting us know! This is great! I wonder if one of the GSoC students would be able to implement an extractor for that? Cheers, Pablo On Sun, Sep 1, 2013 at 10:06 AM, Andy Mabbett < >wrote: uOn 1 September 2013 18:06, Andy Mabbett < > wrote: I should also have included {{Hlist}}: Template:Hlist" "Statistics on raw wikipedia infobox usage" "uHello everyone, Are there any tools or at least guidelines which can be useful to get some descriptional statistics for particular Wikipedia dump with regard to categories size and infoboxes & their properties usage? I mean if I am going to add more mappings for Russian DBPedia it would be worth to know the following information to make mappings which will yield more data: 1) what are the biggest categories with respect to the articles number (including subcategories), 2) which infobox templates are used more often inside particular category, 3) which infobox properties are usually filled, 4) and so on. I guess that this information can be derived by quering against following DBPedia datasets: Raw Infobox Properties, Articles Categories,Categories (Labels),Categories (Skos). But is there some better (or simpler) way do that? uΣτις 19 Οκτ 2012 12:55 μ.μ., ο χρήστης \"Rinat Gareev\" < > έγραψε: some descriptional statistics for particular Wikipedia dump with regard to categories size and infoboxes & their properties usage? 
worth to know the following information to make mappings which will yield more data: (including subcategories), DBPedia datasets: You can use these for more advanced statistics, but this should do for your case mappings.dbpedia.org/server/statistics/ru/ Best Dimitris uHi Rinat, On 10/19/2012 11:54 AM, Rinat Gareev wrote: uThank you! Is it possible to deploy the \"Mapping Statistics tool\" locally? Is it open-sourced somewhere? uIf it helps, this link contains some directions Best, Dimitris On Wed, Oct 24, 2012 at 1:12 PM, Rinat Gareev < >wrote:" "Extraction framework debugging with IntelliJ IDEA" "uHi all, are there any guidelines on how to debug the extraction framework using IntelliJ IDEA? I cannot find any document nor reference. Many thanks Andrea uHi Andrea, On 01/21/2013 06:22 PM, Andrea Di Menna wrote: It's simple, you should first install the Scala plugin for IntelliJ [1]. This article also explains debugging with IntelliJ [2] [1] [2] compiler.html" "arabic dbpedia" "uHi all, the Arabic DBpedia was published in January of this year, but I can't find it at this URL, ar.dbpedia.org. Is there another URL? What is the news on this project? Cordially" "Gold Snapshot of :BaseKB a free download on BitTorrent" "uFor a long time I've heard feedback from people who find it challenging to download large files such as the Freebase data dump and :BaseKB. That's why Gold Snapshots of :BaseKB are now available via BitTorrent By simply loading a small torrent file into a program like utorrent or Transmission, you can efficiently and reliably get a copy of :BaseKB without the risk of data corruption. We plan to release Gold Snapshots on a quarterly basis and to indefinitely retain the data files. This is a big plus for academics, who can use Gold Snapshots for research and expect that others will be able to reproduce their results. It's also great for people who don't want to deal with the hassle of getting an AWS key or dealing with weekly updates. Since most Torrent clients provide the option to select which files you download, you can further speed things up by selecting only the files you need for a particular project. If you need access to Freebase data in the past week, you can still access this data on a requester-paid basis in AWS, and, better yet, access data in S3 directly with Amazon Elastic Map Reduce for parallel processing. :BaseKB data is compatible with industry standard triple stores; recently I found it wasn't only possible, but it was easy to load :BaseKB into Virtuoso 7.1 :BaseKB contains all relevant data from the Freebase RDF dump, but subtracts large amounts of repetitive and irrelevant information, corrects problems with literal formats, and subdivides the dump into portions such that a working database can be easily half the size of the complete Freebase RDF dump. Put this together with advances in triple store performance from the past two years, and now anybody who wants to work with Freebase data in a triple store can do so." "travel wiki dataset" "uHi all, Inspired by what dbpedia does with wikipedia, I wanted to try creating a similar dataset for wikitravel.org.
Before going through how dbpedia does the same for wikipedia, I wanted to know if the solution used by dbpedia could be easily transferrable to this project or a completely fresh approach should be used. Looks like the dataset created for this would be much smaller than for dbpedia, thus I want to make sure I use the right approach. All opinions and inputs are highly appreciated. Thanks, ~Amulya Hi all, Inspired by what dbpedia does with wikipedia, I wanted to try creating a similar dataset for wikitravel.org . Before going through how dbpedia does the same for wikipedia, I wanted to know if the solution used by dbpedia could be easily transferrable to this project or a completely fresh approach should be used. Looks like the dataset created for this would be much smaller than for dbpedia, thus I want to make sure I use the right approach. All opinions and inputs are highly appreciated. Thanks, ~Amulya uHi Amulya, here is my educated guess: Yes, there are quite a few structure, which you can extract with the DBpedia framework. You will get infoboxes and some other data. It is definitely worth to try it. We are currently working on extensions for other Wikis, especially converting Wiktionary to RDF with a generic extractor. Please look here: These are xml configurations for the German and the English Wiktionary. I also cc'ed Jonas who develops the plugin. We would be happy to provide downloads for the data you produce Tell me if you need sourceforge access to branch the mercurial repo. All the best, Sebastian On 02/29/2012 08:56 AM, amulya rattan wrote:" "Category URIs broken" "uKingsley, Zdravko, It turns out that all DBpedia category URIs are broken since the server move from FU to OpenLink. Example: $ curl -I HTTP/1.1 303 See Other Server: Virtuoso/05.00.3023 (Solaris) x86_64-sun-solaris2.10-64 VDB Connection: close Content-Type: text/html; charset=UTF-8 Date: Tue, 19 Feb 2008 18:23:00 GMT Accept-Ranges: bytes Location: /page/Category%3ABerlin Content-Length: 0 Note that the redirect goes from to . The \":\" character has been %-encoded somewhere in the process. This sends the client into a 404 because does not exist. It should be . The problem is only in the redirect; the target at works fine. I don't know why this happens. (Just one thing to look into: Pubby has a configuration option \"conf:fixUnescapedCharacters\", whose value is a list of characters to be %-encoded in the redirect. We needed this option to work around a quirk in Apache in reverse proxy mode. But the Pubby config file I've sent you doesn't have the \":\" character in the list, so I don't know if this configuration option has anything to do with it.) Can you please look into this? Richard uRichard Cyganiak wrote: Richard, The current DBpedia 3.0 release is in \"hot staging\" mode right now. Thus, if it is too broken we will revert back to the prior release and then re-stage DBpedia 3.0 on a different server for further testing. I will give the current instance another 24-48 hours re. testing and fixing before reverting back if need be. uKingsley, On 19 Feb 2008, at 20:48, Kingsley Idehen wrote: I believe the issue above is unrelated to the dataset, but caused by some Pubby/Virtuoso interaction and has most likely been present ever since we moved servers from Berlin to Burlington. If possible, let's push ahead and fix the last remaining glitches. I also think it's acceptable to leave this particular issue unresolved for a few days, it shouldn't affect too many users and is not highest priority. 
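The ":" versus "%3A" behaviour Richard describes in the "Category URIs broken" thread is easy to reproduce. A small sketch showing how the set of characters treated as safe by the encoder decides whether Category:Berlin survives a round of percent-encoding; the paths are just illustrations.

  from urllib.parse import quote, unquote

  path = "/resource/Category:Berlin"

  # Treating ":" as safe leaves the category URI untouched.
  print(quote(path, safe="/:"))   # /resource/Category:Berlin

  # An encoder that does not treat ":" as safe produces the broken redirect target.
  print(quote(path, safe="/"))    # /resource/Category%3ABerlin

  # Decoding maps both forms back to the same path.
  print(unquote("/page/Category%3ABerlin"))  # /page/Category:Berlin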
I would prefer this over having to revert to an older version or server. Richard uRichard Cyganiak wrote: Okay." "Newbie questions: Sparql and dbpedia" "uHi All, I'm still learning and playing with dbpedia, sparql, and RDF, and trying to form a complete picture of all the ways to browse and query RDF content and dbpedia. I've got some questions to ask the group One thing I struggled with is names of specific predicates and objects in queries. I learned that subjects are Wikipedia page names. At least for infoboxes, predicates seem to be property names and objects the property values. Where can I go to discover other subjects, predicates, and objects (categories, geonames, people, etc.)? Is there any potential value in an RDFS/owl ontology to create a directory of these? So how are other datasets (besides infoboxes) actually integrated into dbpedia? I assume they're just laid in the database; no explicit linkages other than what consistencies there happen to be between the corresponding vocabularies. Any general statistics on how consistent terms are (like between places in Wikipedia and geonames)? Are such statistics feasible? I've found it pretty easy to develop queries using the Leipzig query tool translate these statements into sparql queries. Are there simple rules for doing this? What PREFIXs are predefined for this tool? Thanks, Rich uRich Knopman wrote: All, I have placed some sample .rq (SPARQL Query Definitions Only) and .isparql (Dynamic Data Pages that execute saved SPARQL Queries) at: I have also created a simple screencast that shows how to use the iSPARQL Query Builder at: YouTube: am still trying to figure YouTube out etc) My Personal Data Space Server: (* I suggest you WGET of \"Save Link As\" instead of just clicking on this movie file). I hope this helps everyone when using the SPARQL Query By Example builder at: you interact with the results of your queries via the Query By Example (QBE) Tab you will see SPARQL generated for you on the fly. Kingsley" "Ampersand in dbpedia returned URI breakingJena code" "uMarvin, yes, it's a bug in our dataset. In particular in the Yago dataset, which has been contributed externally and wasn't created with the DBpedia framework (but hey, we've got many similar bugs in datasets created by our framework ;)) Yago URIs have not been url-encoded. So as a workaround, you can url_encode all URIs starting with yago_en.nt file before loading it into your Jena model. That should do it. And we'll fix that bug for the future. Best, Georgi uGeorgi, Thanks for the reply! The problem is that loading dbpedia in an RDF store takes close to 40 hours (some RDF stores will even break), therefore I am using the DBPedia virtuoso server for now. Can you think of another solution? Thanks again, Marv uMarv, I had the same problem. You might try Openlink Virtuoso[1] as RDF Store. It performs very well. (I loaded almost the complete DBpedia datasets in just a couple of hours (on a fast server)) If you want to try it, I can send you some Unix scripts, my ini file and some hints & links. But then you could also quickly apply a hack like this: try{ QuerySolution soln = results.nextSolution() ; String x = soln.get(\"Concept\").toString(); System.out.print(x +\"\n\"); }catch (Exception e) {} which will result in a running script, but has some missing triples. Alternatively, you could retrieve the XML via http with java and then fix the XML and use ResultSetFactory.fromXML(String str), which shouldn't be so much work (not sure). 
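A rough Python equivalent of the "retrieve the XML via HTTP, fix it, then parse it" workaround Sebastian describes just above, only as a sketch: the regex escapes ampersands that are not already part of a character or entity reference, and everything beyond the ?Concept query from this thread is an assumption.

  import re
  import urllib.parse
  import urllib.request
  import xml.etree.ElementTree as ET

  ENDPOINT = "http://dbpedia.org/sparql"
  QUERY = "SELECT ?Concept WHERE { [] a ?Concept } LIMIT 100"

  params = urllib.parse.urlencode({"query": QUERY, "format": "application/sparql-results+xml"})
  with urllib.request.urlopen(ENDPOINT + "?" + params) as response:
      raw = response.read().decode("utf-8")

  # Escape bare ampersands so the result document becomes well-formed XML again.
  fixed = re.sub(r"&(?![a-zA-Z]+;|#[0-9]+;|#x[0-9a-fA-F]+;)", "&amp;", raw)

  tree = ET.fromstring(fixed)
  ns = "{http://www.w3.org/2005/sparql-results#}"
  print(len(tree.findall(".//" + ns + "result")), "results parsed")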
I hope I could give you some options, Sebastian Hellmann [1] virtuoso uMarvin, Kingsley, On 20 Aug 2008, at 00:16, Georgi Kobilarov wrote: Actually, no. It's a bug in Virtuoso's SPARQL+XML result format serializer. Ampersands are allowed in URIs, so the Yago URIs are perfectly fine according to all the specs. (We *might* still want to %-encode the ampersand in those URIs, but just for consistency with our other URIs, not because the specs require it. That's a separate question.) The problem is: When a \"&\" character is included in content inside an XML file, it has to be written as \"&\". Virtuoso does not do this, hence the breakage. (This is a silly bug. The need to encode reserved characters (& and \") is just about the first thing a developer learns about XML. I hope OpenLink fixes this soon. Kingsley?) Richard u uRichard Cyganiak wrote: Richard, We will look into this matter. If it's solely our issue, then of course it will be fixed quickly. Kingsley uSeaborne, Andy wrote: uSeaborne, Andy wrote: uRichard Cyganiak wrote: uKingsley, from looking at the query result it seems like the issue is fixed. Thanks! Confirmation from someone who uses Jena to access the SPARQL endpoint would be nice. Richard On 20 Aug 2008, at 19:30, Kingsley Idehen wrote: uKingsley, et al Yes seems fixed, Jena code works correctly now on the query (without a filter). So all is fine! Thanks to everyone for the prompt responses! By the way, what was the fix? Is it something you changed in Virtuoso? Just curious. Marv uMarvin Lugair wrote: Yes, the XML based results serialization part of the SPARQL protocol. Don't even quite understand how this bug came to be in the first place, let alone how it's remained in place so long :-( Kingsley" "SPARQL Endpoints returns 500 Internal Server Error on some requests" "uHi all I've noticed an odd issue with the DBPedia SPARQL endpoint which did not previously exist. Given the following example query: PREFIX rdfs: SELECT * WHERE {?s a rdfs:Class } LIMIT 50 I send the query to DBPedia using a HTTP client with the following accept header: application/sparql-results+xml,application/sparql-results+json;q=1.0 And it comes back fine. But if I use a broader accept header it returns a 500 response. The accept header in question is as follows: application/rdf+xml,text/xml,text/n3,text/rdf+n3,text/turtle,application/x-turtle,application/turtle,text/plain,application/x-ntriples,application/json,text/json,application/sparql-results+xml,application/sparql-results+json;q=0.9,*/*;q=0.8 This used to work fine as this is code integrated in an application which I have tested without issue in the past. The reason I want to send this broader header is that I don't know ahead of time whether the query I'm sending is a query that returns a result set (i.e. ASK/SELECT) or a query that returns a graph (i.e. CONSTRUCT/DESCRIBE) Changing the Accept Header to */* is a possible option on my side (though not ideal as I want to limit my accepted content types to those for which I have parsers) but there does seem to be an issue at DBPedia's end? What is the problem and can this be fixed at your end? I would prefer not to have to parse the query locally and then send a more limited header as that procludes the user entering queries containing custom SPARQL extensions that my parser won't understand Rob" "Abstract parsing error" "uI'm using the AbstractExtractor to grab abstract from wiki pages. It incorrectly parses I get this whole text block as one TextNode rather than a TextNode, SectionNode, TextNode. 
\" has come to mean a person's principal weakness. ==Etymology==Achilles' name can be analyzed as a combination of \"" "shortabstract_en.nt: character encoding?" "uHi all. A question about character encoding in shortabstract_en.nt , for example \"\u010C\u00E1raj\u00E1vri is a lake in the municipality of Kautokeino-Guovdageaidnu in Finnmark county, Norway.\"@en . How can %C4%8C be decoded? Obviously it's not Unicode. (As a side note: I would really like a UTF-8 only version of all dbpedia files - I know some tools need the above \"tricks\", but ) Greetings Sven uHallo, Sven Hartrumpf schrieb: That is URL encoding. There should be a urldecode() method available for your programming language to reverse the encoding process. Kind regards, Jens uHello Jens. Thanks for your answer! I should have spent some more details here: If I url-decode the above, I don't know what the result should be. UTF-8? Sven uOn 25 May 2009, at 14:30, Sven Hartrumpf wrote: Yes. The byte sequence that you get after decoding the %-encoding is to be turned into a character sequence by using UTF-8. Best, Richard uMon, 25 May 2009 17:18:19 +0100, richard wrote: resource/Äárajávri Invalid UTF-8 code encountered at line 0, character 9, byte 9. The sequence is not a valid UTF-8 character because the first byte, value 0xC4, bit pattern 11000100, requires 1 continuation bytes, but of the immediately following bytes, byte 1, value 0xC3, bit pattern 11000100 is not a valid continuation byte, since its high bits are not 10. uOn 26 May 2009, at 07:40, Sven Hartrumpf wrote: Use proper tools. The continuation byte after 0xC4 is 0x8C, not 0xC3. This is plainly obvious from looking at the original %-encoded string. 0xC4 0x8C in binary is 11000100 10001100, the payload bits are 00100 001100 (see [1] for handy table), which in hex is 0x10C, which according to [2] is LATIN CAPITAL LETTER C WITH CARON: \"Č\". The entire string is \"Čárajávri\", which I figured out simply by copy- pasting the original URI into my browser's URL bar and hitting ENTER. In general, don't pass unicode characters through the shell. This will just mess things up. Store them in a file, open it in your web browser, and try different options from the \"View -> Character Encoding\" menu to understand what's going on. Best, Richard [1] [2] uThanks Richard for your patience and explanations. Yes, It seems that the shell (bash) does not play nicely with UTF-8 strings, at least in my environment :-( So, I will use the alternatives you described. 
Sven" "Java exception" "uHello everybody, I tried today to access this page [1] to say unmapped properties for the template film in Arabic , but I got a java exception saying : Error Exception: java.lang.IllegalArgumentException: Could not find template: فيلم Stacktrace: org.dbpedia.extraction.server.resources.PropertyStatistics.getMappingStats(PropertyStatistics.scala:147) org.dbpedia.extraction.server.resources.PropertyStatistics.get(PropertyStatistics.scala:42) sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) java.lang.reflect.Method.invoke(Method.java:601) com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288) com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1483) com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1414) com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1363) com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1353) com.sun.jersey.server.impl.container.httpserver.HttpHandlerContainer.handle(HttpHandlerContainer.java:191) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77) sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:80) sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:668) com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77) sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:640) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) java.lang.Thread.run(Thread.java:722) May you please solve this problem from your side. Another thing, I can't access this page [2], when I visit it, it doesn't show anything at all, what's wrong please ? Best regards; Ahmed. [1] [2] uHi Ahmed, it means that the template you check has been created after the statistics generation time (around May-June 2012) and does not exists Cheers, Dimitris On Thu, Feb 14, 2013 at 9:00 PM, Ahmed Ktob < > wrote: uI'll try to re-generate the statistics for all languages soon-ish. Sorry for the problems. On Fri, Feb 15, 2013 at 11:21 AM, Dimitris Kontokostas < > wrote: uHello; Thank you for your answers, I am waiting for the new statisticsgood luck. Best regards; Ahmed. On 15 February 2013 15:02, Jona Christopher Sahnwaldt < >wrote:" "Virtuoso 37000 error related to partitioning" "uHi DBpedians, with some SPARQL queries, such as [1], I get an error which has something to do with partitioning: Virtuoso 37000 Error CL: SQL internal error with partitioned grou or order reader. 
Can do dbf_set ('enable_setp_partition', 0) to disable the feature, which will remove this error. The same query seems to work with other entities, so it is obviously sane in itself. Any suggestions? Best, Heiko [1] [2] sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query;=SELECT+%3Fx+COUNT%28%3Fs%29+AS+%3Fc0+WHERE+{+%0D%0A%09{SELECT+%3Fx+WHERE+{{%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FAngela_Merkel%3E+%3Fy0+%3Fz0.+%3Fz0+%3Fy1+%3Fx}+UNION+{%3Fx+%3Fy0+%3Fz0.+%3Fz0+%3Fy1+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FAngela_Merkel%3E}}}%0D%0A%09%3Fx+dcterms%3Asubject+%3Fs+.+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FAngela_Merkel%3E+dcterms%3Asubject+%3Fs}+ORDER+BY+DESC%28%3Fc0%29&format;=text%2Fhtml&timeout;=0&debug;=on uHi Heiko, On 12/17/2012 01:03 PM, Heiko Paulheim wrote: You can use endpoint [1], as described in that thread [2]. You can also use the DBpedia-Live endpoint [3]. [1] [2] [3] sparql" "issues with iri encoding in linked data frontend" "uhello dbpedia developers, the main dbpedia frontend has recently (between september[1] and december) started having issues with articles with umlauts or other non-ascii characters in it. requests like fetching data about Austria (Österreich) like curl -H 'accept: text/turtle' -v leading to curl (which look encoded correctly as per [2], given that http can't transport IRIs) give the following output lines modulo boilerplate: owl:sameAs . owl:sameAs dbr:Österreich . the first output line is tautological, but so what; the second is important information as per [3], but following one's nose stops there for lack of a way to dereference dbr:Österreich that was not already tried. this is especially impractical in foreign language wikipedias (german dbpedia has been having issues on its own[1]), and areas where funny characters can occur in titles like movies (eg. [4], [5]). i failed to find the source of the dbpedia component that is acutually serving the linked data; my suggestion is that it should serve the iri's statements on the canonical (as per [2]) address. the two currently shipped lines should probably stay to allow identification independent of whether the application tried to fetch it as an IRI or a URI. thanks for your work and your consideration chrysn ps. please cc me in replies [1] [2] [3] [4] [5] u[forgot to include the list] On Tue, Jan 3, 2017 at 2:24 PM, Dimitris Kontokostas < > wrote:" "Information Extraction using DBpedia" "uHi, I have a simple IE task. Simply want to distinguish between \"PERSON\", \"LOCATION\" and \"ORGANIZATION\" concepts. It means that I have a DBpedia's URI, what is the type (\"PER\", \"LOC\" and \"ORG\") of this resource? Using the following query I can get resource's type: * SELECT * WHERE { a ?o } but the output contains different labels (such as \"Settlement\", \"IranianProvincialCapitals\" and etc.). I don't know how to reason from this output? Currently, I have a lot of \"ifthen\" conditions which test if the output contains \"place\" (for example) or not: * if (tobject.contains(\"place\") || tobject.contains(\"locations\")                         || tobject.contains(\"ProtectedArea\")                         || tobject.contains(\"SkiArea\")                         || tobject.contains(\"WineRegion\")                         || tobject.contains(\"WorldHeritageSite\")                         || )                 It's really not a good way. These classes are structured in an ontological manner. 
Would you please help me to construct a query to \"reason\" the type of each resource (\"PER\", \"LOC\" and \"ORG\") in DBpedia? Hi, I have a simple IE task. Simply want to distinguish between \"PERSON\", \"LOCATION\" and \"ORGANIZATION\" concepts. It means that I have a DBpedia's URI, what is the type (\"PER\", \"LOC\" and \"ORG\") of this resource? Using the following query I can get resource's type: SELECT * WHERE { < || tobject.contains(\"ProtectedArea\") || tobject.contains(\"SkiArea\") || tobject.contains(\"WineRegion\") || tobject.contains(\"WorldHeritageSite\") || ) It's really not a good way. These classes are structured in an ontological manner. Would you please help me to construct a query to \"reason\" the type of each resource (\"PER\", \"LOC\" and \"ORG\") in DBpedia? uHi Amir, On 02/27/2013 05:56 PM, Amir Hossein Jadidinejad wrote: Does the following query do what you want: SELECT * WHERE { dbpedia:Berlin a ?o. FILTER(?o LIKE ) } uHi Amir, The reasoning you want is the classic deductive reasoning using classes and subclasses. Settlement is defined as a subclass of Place (although maybe not directly). That means that all Things that are Settlements are also Places. Tehran is a Settlement, so it is also a Place. If you want to see whether some Thing is a Place, you should look at the rdf:type and reason your way up via rdfs:subClassOf and see if you end up at Place. The class Place in DBpedia has URI . DBpedia has an option \"transitive\", that can be used to make subclasses of subclasses match as well. I'm not sure that that option is part of SPARQL, so this option may not work everywhere. To select 100 items that are in a subclass of Place: select distinct ?Concept where {?Concept a ?p . ?p rdfs:subClassOf dbpedia-owl:Place OPTION (transitive).} LIMIT 100 With SPARQL ASK you can ask whether there is a match. Is dbpedia:Tehran in a subclass of Place? (DBpedia says \"true\") ASK {dbpedia:Tehran a ?p . ?p rdfs:subClassOf dbpedia-owl:Place OPTION (transitive). } But in case you're looking for something that is only defined as a Place and not as a subclass of Place, you need to know whether Place is a subclass of Place. ASK { dbpedia-owl:Place rdfs:subClassOf dbpedia-owl:Place OPTION (transitive). } says \"false\". So you want to ASK if some Thing is a Place or a subclass of a Place. ASK { { ?thing a ?p . ?p rdfs:subClassOf dbpedia-owl:Place OPTION (transitive). } UNION { ?thing a dbpedia-owl:Place . } } Replace ?thing by the URI of the Thing you want to check. I think you can construct the queries for Person and Organisation yourself :) Good luck! Ben On 27 February 2013 17:56, Amir Hossein Jadidinejad < > wrote: uOn 2/27/13 5:28 PM, Ben Companjen wrote: Adding to the above, some live examples: 1. GMXCVM4 uIt's great. But I have a problem. Some instances such as: \" Is it posiible to change the following query to manage all locations? ASK {   {     ?thing a ?p .     ?p rdfs:subClassOf dbpedia-owl:Place OPTION (transitive).   } UNION   {     ?thing a dbpedia-owl:Place .   } } uDuck and cover! I sense a deep discussion about to start on whether an ocean is a place, a body of water, both of those things or neither of those things. :) What about avoiding the ontological discussion and just searching for things that have geolocation? Cheers Pablo On Fri, Mar 1, 2013 at 9:53 AM, Amir Hossein Jadidinejad < > wrote: uOn Fri, Mar 1, 2013 at 9:53 AM, Amir Hossein Jadidinejad < > wrote: That's because have an Infobox, just a set of coordinates. 
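Ben's OPTION (transitive) trick above is a Virtuoso extension, as he notes himself; with standard SPARQL 1.1 the same check can be written with a property path. A sketch that runs the portable form against the public endpoint, using dbpedia:Tehran as in the thread and only Python's standard library; rdfs:subClassOf* matches zero or more subclass steps, so it also covers things typed directly as Place and no UNION is needed.

  import json
  import urllib.parse
  import urllib.request

  ENDPOINT = "http://dbpedia.org/sparql"

  QUERY = """
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  ASK {
    <http://dbpedia.org/resource/Tehran> a ?type .
    ?type rdfs:subClassOf* <http://dbpedia.org/ontology/Place> .
  }
  """

  params = urllib.parse.urlencode({"query": QUERY, "format": "application/sparql-results+json"})
  with urllib.request.urlopen(ENDPOINT + "?" + params) as response:
      print(json.load(response)["boolean"])  # True if Tehran is a Place or in a subclass of Place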
You could try looking for all resources with rdf:type resource has coordinates). On the other hand, there will probably be some places that don't have coordinates and thus don't have this type, so maybe you should look for resources that have type dbpedia-owl:Place OR gml:_Feature . JC" "Live and historic, help" "uDear community, I work at INRIA where we host the french dbpedia chapter, I'm the maintainer. We also want to deploy a dbpedia live. I think we need an OAI key but I don’t know how to obtain it. So any help will be great please. Also I work on a script for extract data from historic dump, and it can extract data like the number of revision by article, the comment and identifier of contributor for an revision, and the date of the first and last revision of an article. Actualy it's a nodejs script with cluster lib. If you are interested, we want to share it when it's finished, but I wish I knew what is for you the best way for reutilisability. I can try to recode the functionality with scala for an extractor in the extractor-framework, but I don't know if it can used for an article with many revisions and this kind of data. I can also push the code on a github. Or other ways.   With Regards Raphaël Boyer INRIA France WIMMICS Team Dear community, I work at INRIA where we host the french dbpedia chapter, I'm the maintainer. We also want to deploy a dbpedia live. I think we need an OAI key but I don’t know how to obtain it. So any help will be great please. Also I work on a script for extract data from historic dump, and it can extract data like the number of revision by article, the comment and identifier of contributor for an revision, and the date of the first and last revision of an article. Actualy it's a nodejs script with cluster lib. If you are interested, we want to share it when it's finished, but I wish I knew what is for you the best way for reutilisability. I can try to recode the functionality with scala for an extractor in the extractor-framework, but I don't know if it can used for an article with many revisions and this kind of data. I can also push the code on a github. Or other ways. With Regards Raphaël Boyer INRIA France WIMMICS Team uWelcome Raphael, Am 12.08.2015 um 12:02 schrieb Raphael Boyer < >: Most probably, Dimitris will help you out with the key. Did you have a look on the server module of the extraction framework [ You might consider participating in the monthly developers telco [ uHi Raphael, Dimitris and I had a long Discussion with Christophe Desclaux while ago, about configuring a DBpedia Live endpoint for the French chapter. As far as I know he managed to set up a live endpoint and received an OAI stream access. The endpoint is also running and extracting under: Could you clarify why you are requesting access again ? Cheers, Alexandru On Wed, Aug 12, 2015 at 1:25 PM, Magnus Knuth < > wrote: uOn Wed, Aug 12, 2015 at 3:48 PM, Alexandru Todor < > wrote: I was about to ask the same :) Cheers, Dimitris uHello Alexandru, Christophe Desclaux had done it for a private company in the context of a short project with the Ministery of Culture. We don't know if this endpoint will be maintained and we anticipate any discontinuity of service by deploying a new one at Inria. With regards Raphaël Boyer INRIA France WIMMICS Team" "Strange reset of the liveupdates counter" "uHi Mohamed, all, I just noticed weird behavior on the liveupdates site lastPublishedFile.txt was at 2011-09-21-11-000473. 
I made a change on a Wikipedia page to see if the change made it into our local synchronized DBpedia live. However, after saving the change, the counter in lastPublishedFile.txt was reset to 2011-09-21-11-000000 And all the files in overwritten. Is this the expected behavior or is something going wrong in the extraction process? Kind regards, Karel uHi Karel, First of all, sorry for my belated answer but I was so busy last days. On 09/21/2011 11:44 AM, karel braeckman wrote: We have fixed some bugs, including the one you have referred to in your mail (the variable names in the .removed files are repeated), and deployed the new framework to our server. So, we had to restart/stop the framework for a while, so that problem occurred but now everything is back to work and the problem vanishes." "query not returning results if inside SERVICE" "uany idea why this is not returning anything? SELECT * WHERE { SERVICE { ?aus < } } this is working fine, when issued directly at the sparql endpoint ( SELECT * WHERE { ?aus < } wkr turnguard | Jürgen Jakobitsch, | Software Developer | Semantic Web Company GmbH | Mariahilfer Straße 70 / Neubaugasse 1, Top 8 | A - 1070 Wien, Austria | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22 COMPANY INFORMATION | web : | foaf : PERSONAL INFORMATION | web : | foaf : | g+ : | skype : jakobitsch-punkt | xmlns:tg = \"http://www.turnguard.com/turnguard#\" any idea why this is not returning anything? SELECT * WHERE {  SERVICE < http://live.dbpedia.org/sparql > {    ?aus < http://dbpedia.org/ontology/country > < http://dbpedia.org/resource/Austria >;         < http://dbpedia.org/ontology/postalCode > ?remotePostCode  } } this is working fine, when issued directly at the sparql endpoint ( http://live.dbpedia.org/sparql ) SELECT * WHERE {    ?aus < http://dbpedia.org/ontology/country > < http://dbpedia.org/resource/Austria >;         < http://dbpedia.org/ontology/postalCode > ?remotePostCode } wkr turnguard | Jürgen Jakobitsch, | Software Developer | Semantic Web Company GmbH | Mariahilfer Straße 70 / Neubaugasse 1, Top 8 | A - 1070 Wien, Austria | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22 COMPANY INFORMATION | web       : http://www.semantic-web.at/ | foaf      : http://company.semantic-web.at/person/juergen_jakobitsch PERSONAL INFORMATION | web       : http://www.turnguard.com | foaf      : http://www.turnguard.com/turnguard | g+        : https://plus.google.com/111233759991616358206/posts | skype     : jakobitsch-punkt | xmlns:tg  = ' http://www.turnguard.com/turnguard# ' uthanks ;-) if you experience something similar remove \"default-graph-iri\" => it is inserted in the resulting sparql query to live.dbpedia could be considered a virt-bug. wkr turnguard | Jürgen Jakobitsch, | Software Developer | Semantic Web Company GmbH | Mariahilfer Straße 70 / Neubaugasse 1, Top 8 | A - 1070 Wien, Austria | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22 COMPANY INFORMATION | web : | foaf : PERSONAL INFORMATION | web : | foaf : | g+ : | skype : jakobitsch-punkt | xmlns:tg = \" 2014-10-09 17:41 GMT+02:00 Jürgen Jakobitsch < >: uHi Jürgen, Federated queries don't work on any DBpedia Virtuoso endpoint (except the german one where I configured a manual workaround). This is due to a bug in virtuoso dating back to 2011 [1] I opened a Thread on the DBpedia-Developers list about this issue [2]. In the meantime you can use the German endpoint [3] and run your queries from there [4]. 
Cheers, Alexandru [1] [2] [3] [4] On 10/09/2014 05:41 PM, Jürgen Jakobitsch wrote: uhi, i knowi don't issue the query on a dbpedia endpoint, but on my local endpoint for the solution see my prev mail. wkr j | Jürgen Jakobitsch, | Software Developer | Semantic Web Company GmbH | Mariahilfer Straße 70 / Neubaugasse 1, Top 8 | A - 1070 Wien, Austria | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22 COMPANY INFORMATION | web : | foaf : PERSONAL INFORMATION | web : | foaf : | g+ : | skype : jakobitsch-punkt | xmlns:tg = \" 2014-10-09 18:51 GMT+02:00 Alexandru Todor < >: uI noticed, got your new email after I sent mine. I think it would still be nice to have federated queries enabled on the endpoints themselves. Cheers, Alexandru On 10/09/2014 06:56 PM, Jürgen Jakobitsch wrote: uOn 10/9/14 11:41 AM, Jürgen Jakobitsch wrote: Need SPARQL Protocol URLs re., discerning participating endpoints :) u0€ *†H†÷  €0€1 0 + uOn 10/9/14 11:50 AM, Jürgen Jakobitsch wrote: Need the SPARQL example for this problem. Default named graph IRIs can come into play, in a number of ways :) uOn 10/9/14 1:02 PM, Alexandru Todor wrote: Maybe, but for now, there are more than enough issues dealing with what constitutes a SPARQL endpoint on the Web re., uptime etc I am sure you are reading all the Linked Data Fragment commentaries that misunderstand (for the most part) the issues in play re. SPARQL endpoints and the Linked Open Data Cloud. There are also issues in regards to Identity and ACLs that will need to be factored in too. It was disabled for these reasons, from the onset." "Two fully funded PhD positions on Answering Questions using Web Data" "uFraunhofer IAIS is pleased to announce two PhD positions - fully-funded with the EU research project: “WDAqua: Answering questions using Web Data”, which has started in January 2015. Research Area: The project will undertake advanced fundamental and applied research into models, methods, and tools for data-driven question answering on the Web, spanning over a diverse range of areas and disciplines (data analytics, data mining, information retrieval, social computing, cloud computing, large-scale distributed computing, Linked Data, and Web science). Potential topics for a PhD dissertation include, but are not limited to: ● Design of a cloud-based system architecture for question answering (QA), extensible by plugins for all stages of the process of QA and Web data ● High-quality interpretation of voice input and natural language text as database queries for question answering. ● Leveraging Web Data for advanced entity disambiguation and contextualisation of queries given as natural language. ● Question answering methods using ecosystems of heterogeneous data sets (structured, unstructured, linked, stream-like, uncertain). Institution The about 200 employees of the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS; investigate and develop innovative systems for data analysis and information management. Specific areas of competence include information integration (represented by the IAIS department Organized Knowledge), big data (department Knowledge Discovery), and multimedia technologies (department NetMedia). Requirements: 1. Master Degree in Computer Science (or equivalent). 2. You must not have resided or worked for more than 12 months in Germany in the 3 years before starting to work. 3. Proficiency in spoken and written English. Proficiency in German is a plus but not required. 4. 
Proficiency in Programming languages like Java/Scala or JavaScript, and modern software engineering methodology. 5. Familiarity with Semantic Web technologies, Natural Language Processing, Speech Recognition, Indexing Technologies, Distributed Systems and Cloud Computing is an asset. As a successful candidate for this award, you will: 1. Spend the majority of your time at Fraunhofer IAIS, where you will research and write a dissertation leading to a PhD (awarded by the University of Bonn). 2. Have a minimum of two academic supervisors from the WDAqua project. 3. Receive a full salary and a support grant to attend conferences, summer schools, and other events related to your research each year. 4. Engage with other researchers and participate in the training program offered by the WDAqua project, including internships at other partners in the project. Further Information For further information, please see the WDAqua homepage at How to apply Applications should include a CV and a letter of motivation. Applicants should list two referees that may be contacted by the Department and are moreover invited to submit a research sample (publication or research paper). Applications will be evaluated on a rolling basis. For full consideration, please apply until 27.02.2015. Applications should be sent to Dr. Christoph Lange-Bever. E-Mail: Tel.: +49 2241/14-2428 uHi This project (i.s. WD Aqua) targets a broad topic (Question answering on Web of Data) encountering different challenges. In total 15 PhD students will work on that in a training network. But each student will be narrowed down on a specific challenge. We will target all aspects and challenges for launching a question answering system on Web of Data as follows: 1. Making datasets fit for question answering: - Semi-automatically cleaning and enriching datasets for question answering. - Data quality assessment. - Evaluate effectiveness and efficiency on big, real datasets. - Quality-driven dataset discovery & retrieval. 2. Translating questions into federated queries. - Query interpretation. - Query Disambiguation. - Query cleaning. - Formal query construction. - Answer type prediction. - … 3. Cloud-based open question answering architecture (OpenQA) for an open Web 4. Initial training in engineering enterprise-ready software. - Evaluate in two different domains of e-commerce how easily non-experts can compose application-specific QA pipelines using the wizard. 5. High-quality interpretation of voice input and natural language text as database queries For more information you can have a look at if you are interested in a specific topic, I would be happy to share more information with you. On Wed, Feb 11, 2015 at 7:02 PM, Igo R < > wrote:" "Property description" "uHello everybody, Is there a chance to find out the meaning for a certain property used in dbpedia?  Take for example:  dbpedia.org/page/Adam_Smith I found the property dbpprop:after of  dbpedia:Robert_Graham_of_Gartmore and I now want to understand the relationsship between both persons, but can't find any information for the property . 2nd question is, what nlp idea/algorithm is behind the extraction of the relation (from the plain text document)? Thanks a lot in advance. Robert Hello everybody, Is there a chance to find out the meaning for a certain property used in dbpedia? 
Take for example: dbpedia.org/page/Adam_Smith I found the property dbpprop:after of dbpedia:Robert_Graham_of_Gartmore and I now want to understand the relationsship between both persons, but can't find any information for the property . 2nd question is, what nlp idea/algorithm is behind the extraction of the relation (from the plain text document)? Thanks a lot in advance. Robert uHi Robert, dbpprop properties are a simple 1:1 mapping of Wikipedia Infobox templates properties, with no specific mapping nor special inference (I think there are only some checks on data types). On the contrary, dbpedia-owl properties are extracted from mappings of Infoboxes defined on mappings.dbpedia.org website and using the DBpedia extraction framework. Hope this clarifies a bit your doubts. I'm sure the other guys will give you more insights into the mechanism :) BR Andrea Il giorno 14/feb/2013 19:23, \"Robert Glass\" < > ha scritto: uHi Robert, On 02/14/2013 07:21 PM, Robert Glass wrote: You can find the answer of your question in that thread [1]. [1] msg04407.html uHi Robert, On 02/14/2013 07:21 PM, Robert Glass wrote: You can find the answer of your question in that thread [1]. [1] msg04407.html uDoes this mean, that if the community member doing the mapping between infobox/template property and the dbpedia ontology does not provide any descriptive information (e.g. The ones from the wikipedia template), there is no chance for interference? uHi all The thread referenced by Mohamed does not answers Robert's question in my opinion, it just explains that the way the DBpedia ontology is built, it cannot answers the question :) Many elements indeed lack the minimal features allowing them to be understandable and maybe reusable. At the very least, since there is not many formal constraints, and that's OK, every element shoud come with a natural language definition. \"after\" is a symptomatic example of what automatic creation of ontology can lead to. When de-referencing following in the n3 file : @prefix rdf: . @prefix dbpprop: . dbpprop:after rdf:type rdf:Property . @prefix owl: . dbpprop:after owl:sameAs dbpprop:after . @prefix rdfs: . dbpprop:after rdfs:label \"after\"@en . Reading this, what do learn? Nothing, or at least nothing else than what you could infer from the use of this property in data. It's a property, OK. It's identical to itself, OK (never understood the intent of such a tautological triple) Its label is \"after\", which is I guess automatically generated from the URI (or the other way round). The only way to figure this \"after\" is going to the description of dbpedia.org/page/Adam_Smith where you find dbp:Adam_Smith dbpprop:after dbp:Robert_Graham_of_Gartmore dbp:Adam_Smith dbpprop:after dbp:Walter_Campbell_of_Shawfield You have actually to go to back to the source to figure out that this represents a succession of people in an academic office, namely \"Rector of the University of Glasgow\", which Adam Smith occupied from 1787 to 1789, preceded by Robert Cunninghame-Grahame of Gartmore, and succeeded by Walter Campbell of Shawfield. The dbpedia extraction has trimmed all this original context information, resulting in completely meaningless triples. No wonder you can't figure the meaning of \"after\". Best regards Bernard 2013/2/14 Mohamed Morsey < >" "is there any sort of verification done on the data?" "uHi, I see a lot of data from DBPedia which is plainly wrong but could easily be deleted automatically. 
Dates like 30-02-1928 or values like \"}\" can easily be gotten rid of using scripts. Is there any work like that being done?" "Downloading Links between DBpedia and Wikidata" "uDear DBpedians, I have noticed that in the most recent version of DBpedia, there are owl:sameAs links between DBpedia and Wikidata, e.g., owl:sameAs . However, I cannot find any file that collects all those links, e.g., in Could you please provide me with a pointer to those links? Thanks, Heiko uHi Markus, thanks for your quick reply! I have had a look at that folder. It contains the extraction for the Wikidata-as-DBpedia dataset, right? It seems like I could, in theory, reconstruct the DBpedia-to-Wikidata links by combining the two files and (The former links Wikidata-as-DBpedia to Wikidata, the latter links Wikidata-as-DBpedia to DBpedia) But I assume that the actual links, like the statement I posted below, are contained in some file which is loaded into the Virtuoso endpoint? Best, Heiko Am 10.01.2017 um 09:49 schrieb Markus Freudenberg: uHi Heiko, these links are included in the following file (along with the other DBpedia links) you can easily grep the wikidata links if you need only those On Tue, Jan 10, 2017 at 12:30 PM, Heiko Paulheim < > wrote: uThanks, Dimitris! Cheers, Heiko Am 10.01.2017 um 14:08 schrieb Dimitris Kontokostas:" "DBpedia ontology" "uI have looked briefly at the DBpedia ontology and it appears to leave a great deal to be desired in terms of what an ontology is best suited for: to carefully and precisely define the meanings of terms so that they can be automatically reasoned with by a computer, to accomplish useful tasks. I will be willing to spend some time to reorganize the ontology to make it more logically coherent, if (1) there are any others who are interested in making the ontology more sound and (2) if there is a process by which that can be done without a very long drawn-out debate. I think that the general notion of formalizing the content of the WikiPedia is a great idea, but to be useful it has to be done carefully. It is very easy, even for those with experience, to put logically inconsistent assertions into an ontology, and even easier to put in elements that are so underspecified that they are ambiguous to the point of being essentially useless for automated reasoning. The OWL reasoner can catch some things, but it is very limited, and unless a first-order reasoner is used one needs to be exceedingly careful about how one defines the relations. I am totally new to this list, and would appreciate pointers to previous posts discussing such issues related to the ontology. A quick scan of recent posts did not turn up anything relevant to this matter. Perhaps those who have been particularly active in building the ontology would be willing to discuss the matter by telephone? 
This could help educate me and bring me up to date quickly on what has been done here. Meanwhile, Merry Christmas and Happy Hanukkah and Happy New Year to the members of this group. Pat Patrick Cassidy MICRA Inc. 908-561-3416 uOn Mon, Dec 26, 2011 at 7:26 PM, Patrick Cassidy < > wrote: You could create an ontology as it \"should\" be or you can use an ontology which matches the practices and conventions used by the Wikipedia editors. The latter is going to be messy in many ways, but at least it'll have a large quantity of data to work with. Getting any use out of the former would require you convincing all Wikipedians to adhere to your strict conventions, which seems unlikely to me. Another way to approach this would be the MCC/CYC approach. It'll take billions of dollars and you'll need to wait many decades for them to finish, but at the end of it all I'm sure you'd have a perfectly consistent knowledge base. Tom uTom, Thanks for the feedback: [TM] > You could create an ontology as it \"should\" be or you The way an ontology *should be* is the way it will be most useful to those who intend to use it. That means, it should be comprehensible and acceptable to them. As languages go (an ontology serves as a logical language), that also means that it will be the sum of the inputs of those who use it, not something imposed by some external authority. The question that I have not been able to resolve in my brief look at the DBpedia site is, just how is it anticipated that the ontology will be used? Is there an application that uses it? The application that uses it will be the ultimate arbiter of how it \"should be\". I will much appreciate a reference to actual uses in applications, where I can see how it is used and whether additional precision may be useful. I am aware of how wary people (including myself) are of those who would want to impose some ontology or terminology on a community. There is a long history of such efforts. The common resistance to using a complex system devised by others (if something simpler seems to serve as well) is one of the reasons that CYC has not been more widely adopted. In general, a big reason for the lack of wide adoption of CYC (and other \"upper ontologies\") is that people will only make effort to use another system if they have examples of uses so that they are convinced it is worth the effort; - but all significant uses of CYC and SUMO are proprietary and details are not available to the public. But there are also many examples where people *do* make effort to learn a system devised elsewhere, including linguistic systems, when useful applications can be seen. It is common even among ontologists to say that people will prefer to use their own (language/databases/terminologies/ontologies) so that no one language/ontology/database will ever be adopted universally, but we have a fine example of just such adoption of a common language - English. If you go to an international conference, virtually everyone speaks English and presents in English if they want their contributions to be understood by the largest number of people; the motivation is sufficient for people to make the effort to learn the language. And that is where an ontology can serve any community, or the whole world - in any situation where the creators of knowledge want to share it - in a precise form suitable for automated reasoning - with the whole community, however large or small that community is. As I understand it, the community intended to be served by DBpedia is the whole world. 
That is very ambitious, but I feel certain from my own work that it is entirely feasible to create an ontology that will be suitable for that whole world community. It does take more effort than just automatically extracting triples from a data source, structured or unstructured. Such an ontology cannot be imposed from above, it has to grow from the needs and practices of the community that uses it. But it will benefit from the large amount of work already done building other ontologies. Much of the hard work has already been done. The problem with using extracted data triples *alone* as a representation of knowledge is that, except in carefully controlled systems, they have the same problems as natural language itself - the same term may be used with multiple meanings (ambiguity) or many terms may be used with the same meaning (polysemy). Using OWL is a good step, but OWL is only a simple *grammar* for representing knowledge. Communication requires a common *vocabulary* as well as a common grammar. Triple stores created without prior agreement on terminology may still be useful for some probabilistic reasoning purposes. Automated alignment of data from different sources mostly relies on string matching to identify terms that are likely to have the same meanings in different data stores. Reasoning with such databases can generate inferences that rank results by probabilities, and they can be sent to a human interpreter who has the final decision as to whether the inferences are meaningful or nonsensical (as in a Google search). The automated alignment methods I have seen (except in very narrowly constrained domains) tend to have no more than 60% accuracy for any one pair. Automated reasoning will have chains of inferences, and any chain more than one inference in length will likely result in a conclusion that is unlikely to be true - the longer the chain, the less likely an accurate result. So if automated inferencing on data is considered desirable, very high accuracy in the representation is necessary. The good news is that such high accuracy is in fact *practical* (not merely possible), if the proper approach is used. Although different groups and different communities will insist on using their own local terminology, accurate alignment among all groups is still possible if each local community translates its own data into the common language for use by others - who will then be able to use it even if they have no idea who created the information, or for what purpose. Triple stores created by a local group may be precise if the vocabulary is carefully controlled by common agreement. For larger communities, such as that served by WikiPedia, there is little chance of gaining agreement on a single common terminology for all terms. The latter qualification is crucial. What is actually needed is not wide agreement on a massive terminology of hundreds of thousands of terms, but only on a basic defining vocabularyof a few thousand terms that is sufficient to describe accurately any specialized concept one would want to define. Learning to use such a terminology (or an associated ontology) will be comparable in effort to developing a working knowledge of a second language. In effect, in any given community that generates data, it is necessary to have at least one person who is \"bilingual\" in the local terminology and in the common ontology. This is perfectly feasible, if one has the motivation. I have been concerned with this tactic for database interoperability for a number of years. 
A discussion of the issue is given in a recent paper: [[Obrst, Leo; Pat Cassidy. 2011. The Need for Ontologies: Bridging the Barriers of Terminology and Data Structures. Chapter 10 (pp. 99 - 124) in: Societal Challenges and Geoinformatics, Sinha, A. Krishna, David Arctur, Ian Jackson, and Linda Gundersen, edsPublication of a Memoir Volume. Geological Society of America (GSA). (available at: If there is any prospect or hope that the formalization of knowledge envisioned by the DBpedia project will ultimately be used for automated reasoning, it is important that effort be made at an early stage to be sure that there is a proper foundation for accurate representation and avoidance of ambiguity. I can help in this task, and will be happy to do so if others in the community are willing to make an effort to do the kind of careful work required. If, on the other hand, it is expected that only probabilistic information will be extracted from queries on the DBpedia database, suitable only for inspection by potential human users, then such care in formalization may not be required. But it would still be helpful, and wouldn't add a lot of work to what is being done. The main effort is in carefully specifying the meanings of the relations being used, to avoid ambiguity and duplication. [TM] The great advantage of a volunteer community is that it doesn't take a lot of time to get funding, and the expense is mostly born by the volunteers for their own interests and their own views of what may help the public. No funder can impose a set of requirements. We *can* have a perfectly consistent database, and the effort of getting agreement on the basic vocabularyis likely to be a great deal less than is commonly supposed, because that vocabulary is not very large. The work done on DBpedia thus far appears to me to be a good start. How to proceed from here depends on the ultimate goals. I am very interested in learning how this community views its future. Pat Patrick Cassidy MICRA Inc. 908-561-3416 uOn 12/28/2011 12:28 PM, Patrick Cassidy wrote: One trouble is that 'useful' depends on the use. Knowledge base A might be able to attain hyperprecision easily for task B, but only give the 75% accuracy that people settle for today for task C. From the viewpoint of task B, KB A is great. A \"reconciled\" database that gives 90% accuracy for both might be good enough to write a paper about but won't be commercially viable for either task B or C. Cyc inevitably gets mentioned out when the commonsense domain comes up. Even if one does accept the conventional wisdom that Cyc was a failure, the failure of Cyc doesn't mean that the commonsense domain is intractable any more than the thousands of failed attempts at flight before the Wright Brothers meant that airplanes were impossible. The adoption cost is a killer, however, and is one that will probably need to be radically reduced or eliminated if a Cyc-like product is to become mainstream. My take is that Freebase and DBpedia are an extensional approach to the commonsense domain (defining \"person\" with a long list of people and their attributes) rather than the intensional approach taken by Cyc and SUMO (does a person have two arms and two legs? is a person a member of the same species as Carl Linnaeus? is a corporation a person? In what sense is Frodo Baggins a person?) I think the extensional and intensional approaches will both be useful, but I think that computers will need a logical framework that covers everything about as much as people do. 
People generally don't need to think about cooking and quantum mechanics at the same time, and if somebody does, they'll invent their own framework. The only way a framework is going to be useful is if it is actually used, and a framework that \"supports\" oodles of hypothetical use cases that don't actually get used won't be usable for any of them. If that's the case, why don't you use SUMO, which is trying to accomplish exactly that? My answer would be that the vocabulary of a few thousand terms leaves you alone with the grounding problem. With a very large terminology (say the set of 3M dbpedia resources) you can, on the other hand, apply methods that work statistically, and even if you can't find the \"correct\" chain of inference you can find a large number of chains that support correct conclusions. The most direct criticism that can be made of Chomsky's linguistic program is that we've never been able to use it to transplant the \"language instinct\" into a computer. Yet. the \"language instinct\" is a facility that is part of an animal, and perhaps it needs to have an animal attached in order to work. Perhaps not necessarily a flesh and blood animal, but some kind of simulation of one. Mammals are quite good at commonsense reasoning, and if you know them well you'll discover that they're good at many of the things where Cyc tries to extend conventional logic-based systems. To make progress on \"language\" and \"vocabularies\" and such, I think it's necessary to step back and look at the primary process in which humans and animals do probabilistic commonsense inference about themselves, their environments and each other. Well, one trouble is that Dbpedia is not a database of people, places and things. It's a database of Wikipedia pages. I could let this bother me because I'm interested in the kind of ontology that the Greeks were interested in. What sort of things exist? Wikipedia's sense of what a \"thing\" is is the kind of thing that would drive any sensitive person nuts unless they decided to accept it as it is. For example, DBpedia has no concept that corresponds to \"special/exceptional\". It doesn't distinguish between \"Gingerbread\" and \"Gingerbread House\" but recognizes five or so senses of \"Gingerbread man\". It's rather hard to prove that a person or book is notable in Wikipedia but it seems impossible for a video game to not be notable. Every episode of \"Star Trek\" has its own Wikipedia page, but you'll find no episodes of \"General Hospital\". Sometimes it is hard to determine what exact \"thing\" some Wikipedia pages are about. Wikipedia's p.o.v. is not the consistent p.o.v. of an ontological engineer, but is the result of a battle between inclusionists and deletionists. It's approximately consistent because people fix obvious inconsistencies. DBpedia attains hyperprecision by being focused uJust a note on a couple of points raised by Paul Houle: [PC]>> What is actually needed is not wide agreement on a massive terminology of hundreds of thousands of terms would [PH] If that's the case, why don't you use SUMO, which is trying to accomplish exactly that? No, that was not the intended purpose of SUMO, a project in which I participated. SUMO could be used as a starter ontology to create the required inventory of semantic primitives, but would need supplementation. Closer to the required ontology is the COSMO ontology (see OpenCyc, SUMO, BFO, and DOLCE, along with other elements not contained in any of those. 
The COSMO has ontological representations of all of the words in the Longman dictionary defining vocabulary, which are used to create the dictionary definitions of all of the words in the Longman dictionary. Analogously, for any given set of specialized ontologies used in different applications, there will be some set of primitive elements sufficient to create the logical specifications of all of the elements in the domain ontologies. The COSMO is intended to contain all of those primitive elements, and to be supplemented as needed if new primitives are discovered. But I am assuming that the ontology for DBpedia will necessarily develop from the interests of the contributors. It takes effort to find relationships among different domains and to reconcile different viewpoints, but that task can be made more efficient, and the product more accurate by identifying the semantic primitives used in common. It appears, from what I have read about DBpedia thus far, that some effort has already been made to reconcile different terminologies in porting the infobox data into the common data store. That is a very good start, but examining the ontology shows that a lot more work is needed. One thing I am trying to learn is, exactly how is the DBpedia being used. Those uses will determine how the ontology might be modified to be more effective. [PH] >> The only way a Yes, and that is why I want to learn how the DBpedia ontology is being used. [PH] >> Wikipedia's p.o.v. is not the consistent p.o.v. of an ontological Yes, the Wikipedia itself may have inconsistencies, but it is intended for use by people who may be able to interpret the linguistic phrases in their proper context. The ontology, however, does not have to have that problem. Wherever there are inconsistent theories, they can be represented in the ontology as different theories, which do not have to be logically consistent (the CYC \"microtheories\" are one example). The base vocabulary of logical primitives, however, will be consistent, and the same vocabulary of primitives will be able to logically describethe different theories so that the differences will be precisely recognized. To give a trivial example, one can state proposition [A] and proposition [not A]. If \"A\" is definable by the inventory of semantic primitives, then both of these inconsistent theories can be represented by the same consistent ontology. The point here is that the DBpedia ontology canhave a logically consistent representation of all of the information in the Wikipedia infoboxes, and that information can be used for precise automated reasoning. It is not as difficult as it may appear to accomplish this, but it does require that one make the effort. If, however, none of the DBpedia users wants to use the DBpedia for precise reasoning, then the extra effort may be superfluous. That is why I would like to hear from people who are using the DBpedia ontology. [PH] >> My answer would be that the vocabulary of a few thousand terms You have a \"grounding problem\" regardless of the size of the vocabulary. Finding the defining primitives merely allows one to identify the minimum set of concepts that need physical grounding. Statistical methods do allow, in theory, arbitrarily precise results, provided that one has a near-infinite set of correlated examples for precisely those cases one wants to correlate. I don't believe that the structure of the Wikipedia or of DBpedia fit those criteria, but would be quite fascinated if any user has an example of such statistical usage. 
Pat Patrick Cassidy MICRA Inc. 908-561-3416 uIn the \"Use Cases\" page for DBpedia there is a statemtn: uOn 12/29/11 4:04 PM, Patrick Cassidy wrote: uHi Patrick, u0€ *†H†÷  €0€1 0 + uGerard, Thanks for the references. I wasn't able to access the Yago-SUMO page. Perhaps there is another source for it? I do have the OWL version of the DBpedia ontology, and that is what I have been looking at as an indicator of the state of the ontology work. I don't yet know the relation between Yago and the DBpedia ontology. I am not sure how well mappings would work - in my experience ontologies are so different that an attempt to use a \"same as\" link in another ontology would lead to a lot of incorrect inferences. It could still be quite useful for probabilistic searches. I am still in the process of trying to get acquainted with the DBpedia ontology, and aside from what looks like inconsistencies the structure of the ontology itself, there is a peculiar usage: Using the SPARQL query page, types of birds like \"Albatross\" ( ( ), i.e. the query: { ?e } returns a list of types of birds. I would expect these entries to be *subclasses* of Bird, rather than *instances* (the usual interpretation of the rdf:type relation). Is this an intentional variant usage of the notion of \"type\"? There are, in the type list, subtypes of Albatross, such as \" the first species of albatross to be described\" and should therefore be a subtype of Albatross, but since Albatross is not in the ontology, it is listed as having type \"Bird\" or \"Eukaryote\". I am aware that on occasion the distinction between an instance and subclass can be somewhat subtle - for example, for conceptual works, where there may be more than one version (software!!) it would seem more appropriate to consider such works as Classes (types) of thing, with the individual instances being the physical objects that embody the abstract work. But that is often not the usage. My interest is seeing to what extent the Dbpedia can be organized so as to be useful for accurate inference. The ontology at present is sufficiently small that this is probably quite feasible, if a reorganization does not break some existing application. So learning whether there are existing applications that depend on the current structure of the DBpedia ontology is one of the issues of primary concern to me in this regard. Pat Patrick Cassidy MICRA Inc. 908-561-3416 From: Kingsley Idehen [mailto: ] Sent: Friday, December 30, 2011 2:29 PM To: Subject: Re: [Dbpedia-discussion] DBpedia ontology On 12/30/11 5:18 AM, Gerard de Melo wrote: Hi Patrick, An approach to combine the advantages of both worlds is to interlink DBpedia with hand-crafted ontologies such as Open Cyc, SUMO, or Word Net, which enables applications to use the formal knowledge from these ontologies together with the instance data from DBpedia. uHi Kingsley, Adam Pease is going to publish update an updated SUMO.owl file with the new mappings in a few days. I'll email you when it is online. Best regards, Gerard uDear Patrick, The website appears to work for me. You can also read the following paper Gerard de Melo, Fabian Suchanek and Adam Pease (2008). Integrating YAGO into the Suggested Upper Merged Ontology Proceedings of IEEE ICTAI 2008. IEEE Computer Society, Los Alamitos, CA, USA. Technical Report version: DBpedia's ontology is extremely shallow with just 320 classes based on infobox types. Many Wikipedia pages do not have infoboxes and hence the corresponding instances lack any genuine class. 
The advantage of this approach is that it is fairly accurate. YAGO, in contrast, provides several classes for almost all instances with a Wikipedia page, but it is a bit less accurate. The class hierarchy contains many thousands of classes derived from WordNet, which of course is suboptimal in some ways from an ontological perspective. The YAGO-SUMO project aims at replacing the WordNet upper level of YAGO with one based on SUMO. There have been approaches published on how to recognize Wikipedia pages that describe classes rather than individual instances. So far, however, such work has not made its way into DBpedia, which treats every page as describing an instance. People frequently use DBpedia and YAGO as ontologically enhanced databases that allow for join queries of the sort \"Find me all companies founded in the Bay Area in the 1960s\". The ontology is used to make sure you match all relevant subclasses of COMPANY. The YAGO2 demo paper has a screenshot [1] showing how you can visualize the results. Best regards, Gerard [1] www2011demo.pdf uOn 12/31/2011 5:21 AM, Gerard de Melo wrote: Recall is poor but precision is excellent. I went looking in DBpedia 3.5 for topics that were mistyped as :Person and only found four. I can't say there aren't any others, but usually I find a lot more trouble when I go looking for it. Freebase used machine learning methods to find :Person(s) in their database and got about 2x better recall against people documented in Wikipedia than DBpedia does. I haven't tried quality checking against it, however. uKingsley, Thanks for pointing out that example of use of the ontology. This may be a good example to discuss the effects of changing the \"type\" relation to \"subclass\" as it is used for the biological taxonomy. The type relation appears to be used in the same sense that rdfs:subClassOf is used in other OWL ontologies. In the usual usage, if X has type Y, then X is an individual in the class Y, not a subclass. But in the ontology and its associated applications, to determine the parent classes (at some levels) one apparently needs to use the \"type\" rather than the subclass relation (i.e. \"Albatross\" is usually a subclass, not an instance, of \"Bird\"). It also appears that the subclass relation is not propagated up the hierarchy, as it should be for a transitive relation. It is not clear whether these usages were simply a mistake that has propagated, or a variant definition of these relations that was intentionally adopted for some reason.
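(A quick way to check which relation Pat is describing here is to ask the endpoint directly. The query below is only a sketch — it assumes the public SPARQL endpoint at dbpedia.org/sparql and uses Wandering_Albatross, the example that comes up a little later in this thread, as the test resource:

SELECT ?rel WHERE {
  <http://dbpedia.org/resource/Wandering_Albatross> ?rel <http://dbpedia.org/ontology/Bird> .
}

If the answer is rdf:type rather than rdfs:subClassOf, the species resource is indeed being treated as an instance of Bird, which is the usage discussed above.)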
Whatever the reason, I am thinking of the following experiment: Replace the \"type\" relation with \"subclassOf\" in the DBpedia mappings and ontology where it is appropriate, and add in triples that propagatge the \"subclassOf\" up the hierarchy. Then what would happen to the faceted browsing application? I am guessing that the application assumes that the \"type\" relation will be used in the sense of \"subclassOf\" in the ontology mappings, and the results would be quite different from what happens now - and I presume, not what is expected or intended. If this is the case, then the application would have to be modified to use \"subclassOf\" instead of \"type\", to get the intended results. Would this be a major effort to make this change? I start with this suggestion because the unexpected usage of \"type\" and \"subclassOf\" is one of the more striking (to me) examples of how the ontology appears to be in error. Since the hierarchical \"subclassOf\" relation is one of the most important relations in an ontology, I would hope that that relation be used accurately, before other aspects of the ontology are modified. Pat Patrick Cassidy MICRA Inc. 908-561-3416 From: Kingsley Idehen [mailto: ] Sent: Saturday, December 31, 2011 1:38 PM To: Patrick Cassidy Cc: Subject: Re: [Dbpedia-discussion] DBpedia ontology On 12/30/11 5:14 PM, Patrick Cassidy wrote: Gerard, Thanks for the references. I wasn't able to access the Yago-SUMO page. Perhaps there is another source for it? I do have the OWL version of the DBpedia ontology, and that is what I have been looking at as an indicator of the state of the ontology work. I don't yet know the relation between Yago and the DBpedia ontology. I am not sure how well mappings would work - in my experience ontologies are so different that an attempt to use a \"same as\" link in another ontology would lead to a lot of incorrect inferences. It could still be quite useful for probabilistic searches. I am still in the process of trying to get acquainted with the DBpedia ontology, and aside from what looks like inconsistencies the structure of the ontology itself, there is a peculiar usage: Using the SPARQL query page, types of birds like \"Albatross\" ( the query: { ?e } returns a list of types of birds. I would expect these entries to be *subclasses* of Bird, rather than *instances* (the usual interpretation of the rdf:type relation). Is this an intentional variant usage of the notion of \"type\"? There are, in the type list, subtypes of Albatross, such as \" the first species of albatross to be described\" and should therefore be a subtype of Albatross, but since Albatross is not in the ontology, it is listed as having type \"Bird\" or \"Eukaryote\". I am aware that on occasion the distinction between an instance and subclass can be somewhat subtle - for example, for conceptual works, where there may be more than one version (software!!) it would seem more appropriate to consider such works as Classes (types) of thing, with the individual instances being the physical objects that embody the abstract work. But that is often not the usage. My interest is seeing to what extent the Dbpedia can be organized so as to be useful for accurate inference. The ontology at present is sufficiently small that this is probably quite feasible, if a reorganization does not break some existing application. So learning whether there are existing applications that depend on the current structure of the DBpedia ontology is one of the issues of primary concern to me in this regard. 
Pat Patrick Cassidy MICRA Inc. 908-561-3416 Pat, There is a faceted browser associated with the following data spaces that hold DBpedia datasets as per these examples: 1. resource%2FWandering_Albatross uHello Pat, DBpedia is not an ontology - it is a dataset of instance data that partly uses an ontology created by the DBpedia team. The Instance data (ABox) has the namespace The Ontology (TBox) has the namespace I agree with you that it would be nice if many DBpedia entries would be classes with proper subclassing instead of individuals. But as DBpedia data is generated automatically from Wikipedia, I guess it would be quite difficult to keep this consistent. You are not the first one to notice these deficiencies in the use of DBpedia, see \"Frequently Asked Questions\" on Regards, Michael Brunnbauer On Sat, Dec 31, 2011 at 05:34:12PM -0500, Patrick Cassidy wrote: u0€ *†H†÷  €0€1 0 + u uMichael, Thanks for the reference and comment. That does answer part of the question I had. This still leaves me wondering whether the DBpedia ontology is being intentionally kept small (rather than representing all of the classes described in Wikipedia) for performance reasons, or because the effort of trying to make the ontology accurate while expanding it substantially is too large for the existing team. If the latter, perhaps I can help in that task. If there are performance issues, I would like to get more detail to see just how serious an expansion of the ontology would be for the existing usages. The other issue raised by the discussion in the FAQ at the GoodRelations page you cited is - just what should the semantics of a Wikipedia article be? At present, the \"type\" relation is used, and that make it appear odd for traditional ontology structure, when the Wikipedia article describes a class of things rather than an individual. It seems (to me) that it would be more appropriate to use a relation of the type \"isaDiscussionOf\" (where the main topic of the article is exactly the same as the intended class of the Wikipedia), or \"isaDiscussionOfaSubtypeOf\" where there is no corresponding DBpedia ontology entry for that class, but the class described can be related to a broader class in the ontology. The \"type\" relation might still be used, but to be consistent with the notion of Wikipedia articles being discussions rather than the individuals described it might be better to use a relation such as \"isDiscussionOfanInstanceOf\" to point to the class of the entity discussed. Then, the articles would all have the semantics of articles rather than of the entities described by the articles. I do recall someone (not sure who) mentioning that this issue of the proper semantics for articles has been an issue, but I don't recall in what context. I suppose it may well have been a hot topic for the DBpedia community at some time. I apologize that I am quite new to this community, so do not know the history. But unless there are performance issues that are truly insurmountable, it seems to me not to be difficult to get the semantics right so that the ontology can grow to accurately reflect properties and relations of the entities discussed by Wikipedia. And I will be willing to help in that task. Or is this a task that has consciously been left to the GoodRelations group? Pat Patrick Cassidy MICRA Inc. 908-561-3416 uOn 1/2/12 3:02 AM, Patrick Cassidy wrote: The ontology is deliberately small. 
The goal is for others to use it as basis for better ontologies as demonstrated by SUMO, Yago2, Cyc and other TBox mappings. At the end of the day, no ontology is a gospel ontology. Each provides specific context lenses into a given ABox. Kingsley uOn 1/2/12 2:13 AM, Patrick Cassidy wrote: uHi Kingsley, Adam has made an updated version of the SUMO OWL conversion [1] available. This doesn't include the results of your collaboration or of the upcoming new YAGO-SUMO release [2] yet, but it does contain a number of links to YAGO2 and and DBpedia. Best, Gerard [1] [2] yagosumo.html uOn 1/7/12 2:59 PM, Gerard de Melo wrote: Great! We've just reloaded Yago2 onto our LOD cloud cache [1]. We'll get this dataset loaded there too. Links: 1. http://lod.openlinksw.com uHi Andrea, I noticed that you do a lot of changes in the DBpedia ontology. I am sure you have your reasons for adding all these new classes and properties but I'd expecto to see at least a description (in the form of rdfs:comment) on the new items. Note that the DBpedia ontology does not aim to be an ontology for everything and before we introduce a new class/property we try to find if the same functionality can be served from existing ones. For instance, the new class JapaneseRailwayStation could be served from RailwayStation unless there exists so many railway stations in Japan that can justify its existence. Even so, we could conclude the location of a railway station from other location like properties ( country, region). You introduced Gymnast (which is an ancient greek word for coach) and Volleyball coach on the same level. Although I am Greek, I'd choose coach for naming :) and place volleyballCoach as a subclass (or maybe stick with just coach) You introduced a lot of soccer and football classes without any comments, I guess you know the difference between soccer and football in Europe and US You also made a lot of range/domain changes without explaining anything on the change log / talk pages I didn;t look much into the new properties but a few catchy ones I spotted are the new Mass* properties We already handle weight/mass a different way, you can look at for an example case. So you could delete these properties and and use this way intead Anyway, I didn't go into detail for every edit you did the last days but I'd try to be more carefull as every change you make affects the DBpedia in total both long term and immediatelly (through DBpedia live) Best, Dimitris uHi Andrea, I don't want any specific action from you, just use the above comments and review your edits so far. I am sure that most of them are already in the right direction Just note that if you move a page, you'll have to delete the created redirection, otherwise they will exist both in the ontology please see inline for your questions On Mon, Dec 24, 2012 at 2:30 PM, Andrea Di Menna < > wrote:" "Bring your Blog, Wiki, WebApp to the Semantic Web?!" "uHi all, DBpedia exposes semantics extracted from one of the largest information sources on the Web. But one of the nice things about the Web is the variety and wealth of content (including your Blog, Wiki, CMS or other WebApp). In order to make this large variety of small Websites better mashable and bring them on the Semantic Web the makers of DBpedia released technologies, which dramatically simplify the “semantification” of your Websites. 
Please check out Triplify [1] (a generic plugin for Webapps with preconfigurations for Drupal, Wordpress, WackoWiki, others are simple to create), D2RQ [2] (a Java software for mapping and serving relational DB content for the Semantic Web) and Virtuoso [3] (a comprehensive DB, knowledge store, linked data publication infrastructure). Sören [1] [2] [3]" "Missing results on DBPedia end point" "uHi Patrick, Thanks for checking this. Just to let you know, I followed Richard intuition and tried this, which indeed works: SELECT ?result WHERE { ?result ?date. FILTER(str(?date) = \"2006-09-13\") } Christophe" "Frequency of processing Wikipedia data dumps" "uHi, What is the frequency in which DBPedia extracts Wikipedia data dumps? This would help to understand the timeframe to which the current data available corresponds to. Also, does any one know how long does it take for write access to be granted to change mapping information once it is requested for? I have requested for write access yesterday. Thanks and regards, Venkatesh Channal Hi, What is the frequency in which DBPedia extracts Wikipedia data dumps? This would help to understand the timeframe to which the current data available corresponds to. Also, does any one know how long does it take for write access to be granted to change mapping information once it is requested for? I have requested for write access yesterday. Thanks and regards, Venkatesh Channal uHi Venkatesh, On 11/07/2012 03:55 PM, Venkatesh Channal wrote: There are approximately 2 DBpedia releases per year." ""Find in Sindice" link on DBpedia pages?" "uHi all, As some of you already know, I'm a member of the Sindice team at DERI Galway now, and I'm exploring possible synergies between DBpedia and Sindice. Here's the first idea. In the footer of each DBpedia page, there are currently a bunch of utility links: - As N3 - As RDF/XML - Browse in Disco - Browse in Tabulator - Browse in OpenLink Browser How do you feel about adding another one? - Find in Sindice Sindice ( documents that mention a certain URI. The \"Find in Sindice\" link would go to the results of a search for the current DBpedia resource's URI. The effect is that we can discover backlinks from other datasets into DBpedia. For example, if Revyu links into DBpedia, but DBpedia does not link into Revyu, then Sindice helps us to find this link from within the DBpedia page. At the moment, Sindice's coverage is still fairly narrow (just a few large datasets), so it won't produce too many useful results at this time, but this will improve over the next few months. What do you think? Any objections? Richard" "Errors in rdfs:label values for pages with :" "uI have been assuming the rdfs:label property for dbPedia entities are the text of the wiki titles. However, pages with \":\" in the name, such as Baggage (Law&Order;: Criminal Intent) or EyeToy: Play come through with the label having only the rhs of the :, e.g. rdfs:label \"Play\" . Looks like a bad URL conversion. -Chris uChris, It's a bug. This happens because of URIs like In this case, it makes sense to cut the Category: part to get a better label. Unfortunately we over-generalized I logged a bug. Thanks for reporting! Richard On 22 Oct 2007, at 17:42, Chris Welty wrote:" "A map of DBpedia 2016-04" "uHere is a fun visualization I made looking at how the datasets in DBpedia-2016-04 overlap. dbpedia-2016-04-overview.html uCool visualization Paul! 
I see you added the cited facts dataset there, nice :) btw, how are you dealing with the wkd_uris datasets and the named graph hierarchy? Cheers, Dimtiris On Mon, Oct 24, 2016 at 11:11 PM, Paul Houle < > wrote: uWhat I did to make that image was load each TTL file into its own named graph and then counted the triples in the intersections of the graphs. I'm not sure if it is worth loading all of the quad files since it seems most of them just link to the pageid of the wikipedia page that the data came from. Somebody might want that, but I think most people won't care. I did load the triples instead of the quads for the cited facts and I really did lose something because in that case the graph field contains the citation URL. It looks like the WKD datasets have been processed to rewrite DBpedia URIs to Wikidata IRIs. Is that right? What is the \"named graph hierarchy?\" uThanks Paul, On Tue, Oct 25, 2016 at 7:20 AM, Paul Houle < > wrote: I asked that because of the WKD uris dataset you put each file into its own graph which is good and all under a virtual named graph to make querying easier the wkd uris are in one way conflicting here because it is a rewrite of DBpedia uris with WKD IDs, maybe it makes snse to put these in a separate virtual graph (or remove them for the database because it is a duplication of the rest (and possibly subset in case not all pages have a WKD ID) Cheers, Dimitris" "filtering Stanford Named Entity Recognizer output spotlight or not" "u Hello, I am new to dbpedia. I'd like to filter output of Stanford Named Entity Recognizer by taking only those entities contained in Wikipedia. And I'd like to do it offline. Do I have to run complete clone of DBpedia on local Virtuoso? Or is it possible to use dbpedia-spotlight? If so, is there any documentation how to search dbpedia-spotlight indexes? I haven't found anything this.  Thanks for any answer, uHi Pavel, If a good solution for your problem is actually to check dbepdia-spotlight's index, you can easily do it: - you can download the dbpedia-spotlight model ( - download dbpedia-spotlight.jar ( Then you can put the jar as a dependency into a scala terminal and query the list of topics in their stores: just open a scala terminal and do: :load script.scala here is the script: (change the paths according to yours) On Wed, Apr 30, 2014 at 2:28 PM, < > wrote:" "T-box (DBpedia metamodel)" "uHi all! I am working on PhD project involving DBpedia. I need to have the DBpedia Schema (T-box). does anybody know where I can find it? Or if there isn't, does anybody know how to extract it starting from available N-Triple datasets ? The final result should be a file including an OWL Schema and all DBpedia's classes and property Thank you! uLe Sun, 16 Mar 2008 17:40:00 +0100, Davide a écrit : Grep around the extractors - dbpedia/extraction/extractors/ . Or try running this query (a DISTINCT won't work due to perf costs): 2Fdbpedia.org&query;=SELECT+%3Frel+WHERE+{[]+%3Frel+[]}+LIMIT+1000%0D% 0A&format;=text%2Fhtml&debug;=on That should be easy, but won't be as good as hand-crafted OWL schemas if you can write them (I don't think they exist for the DBPedia core). uLe Sun, 16 Mar 2008 17:18:23 +0000, cho a écrit : I notice that even the predicates are partly generated from infoboxes, so you'll have to be more precise in expressing what you are looking for: extraction logic? non-generated data? any kind of structure? 
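Besides grepping the extractors, an approximate T-box can be pulled straight from a SPARQL endpoint by listing the classes and predicates that are actually used in the instance data. The two queries below are only a sketch (the DBpedia ontology is also shipped as an OWL file with the release downloads; on a busy public endpoint you may have to page through the results with LIMIT/OFFSET rather than asking for everything at once):

SELECT DISTINCT ?class WHERE { [] a ?class } LIMIT 1000

SELECT DISTINCT ?predicate WHERE { [] ?predicate [] } LIMIT 1000

The first query approximates the class part of the schema, the second the property part; neither gives you domains, ranges or the subclass hierarchy, which is why a hand-crafted ontology file remains the better starting point.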
uHi, You'll find the Properties from Infoboxes in this Dataset: Other Properties are \"extractor specific\", which can be found easily in the several extractors / datasets. Jörg" "DBpedia query result is different in Jena ARQ" "uI have executed this query on DBpedia sparql endpoint to get class hierarchy, the result was correct. But When I executed it from Jena ARQ the result were different. ElectronicsCompaniesOfFinland class have two super classes ElectronicsCompany108003035 and Company108058098. Where it has only one ( ElectronicsCompany108003035) in DBpedia endpoint result. Why result is different ? I have executed this query on DBpedia sparql endpoint to get class hierarchy, the result was correct. But When I executed it from Jena ARQ the result were different. ElectronicsCompaniesOfFinland class have two super classes ElectronicsCompany108003035 and Company108058098. Where it has only one ( ElectronicsCompany108003035) in DBpedia endpoint result. Why result is different ?" "Automating DBpedia queries with live data and abstracts" "uHello everyone! I have three questions/requests, hopefully they will be easy ones to answer/implement. First, I would like to run an automated, periodic SPARQL query against a somewhat up-to-date DBpedia endpoint, and get, at a minimum, the and properties. I am currently using the query: SELECT * WHERE { ?rel ?value . } where $title is inserted from a list of (currently) 125 articles I am interested in. I had been running it against and tried changing the endpoint to That server appears to be rate-limited or actively anti-automation, as I got a 503 error on the 9th request (the first 8 went through in a second or two). So, is there a place I can go to, or an API key I can obtain, such that I'd be able to refresh our Wikipedia abstracts on, say, a daily or weekly basis using fresh Wikipedia data? Again, it is a very limited set of articles I am interested in (low hundreds), so the burden on the other end would be fairly minimal, and I can schedule it to whatever time suits you. Secondly, another downside is the two live endpoints mentioned in that thread (the other being Contrast: Against the very detailed: (though the lod2 one seems to be lacking in @xml:lang attributes on its non-English HTML elements!) Does anyone feel like adding these missing predicates to the live DBpediae? My third question is, all of the endpoints provide a foaf:primaryTopicOf edge pointing to the English wikipedia page — surely it should have all languages? Ideally, I would like links to each of the other language Wikipedia articles with some tie between the abstract and the wikipedia URL it came from (as there's not a 1:1 relation between ISO language codes and wikipedia subdomains, e.g. \"ChÅ«-jiân\"@nan -> http://zh-min-nan.wikipedia.org/wiki/Ch%C5%AB-ji%C3%A2n so trying to generate a URI myself using the refs:label + language code will not always work). How could this be done, and is anyone willing to do it? TBH I would be happy if the URIs were merely string literals tagged with the corresponding ISO language, though that's obviously far from ideal in terms of LOD. Perhaps both string literals and an array of (untagged) foaf:primaryTopicOf triples would be good enough. – Nicholas. uOn 10/2/14 12:18 PM, Nicholas Shanks wrote: You should be able triangulate your way to canonical data, across these data spaces, by way of relevant relations (e.g., owl:sameAs, in the worst case). 
Note, the actual DBpedia URIs aren't changing: Step-by example i.e., follow this step-by-step by clicking on the links and then digesting what's presented: 1. 2. 3. . Conclusion, you don't lose the canonical DBpedia HTTP URI for a given entity. That should remain the focal point of your references and queries in regards to SPARQL when using . uHi, I am the principal maintainer of the endpoints provided by OpenLink. There are two live Dbpedia endpoints at this time: Both of which have rate limiting to curb over-enthusiastic use by a single IP address. Note that at this time the service provided by OpenLink accepts higher connection rates. This does not mean that you can just query an endpoint and bombard it with requests without taking a few precautions. First and foremost since your requests are to the /sparql endpoint using the HTTP protocol, you could and should check the HTTP status requests to make sure your query actually executed successfully. Your code could easily check for a 503 and do the appropriate thing by sleeping an arbitrary amount of time (say 1 min) and try the same again, which in most cases would return a result as the block is lifted. There are other HTP status codes you should probably check too. Secondly since you state you only want to do this 'periodically' you could also just sleep 1 seconds between each query. Since your service seems to be a background lookup, this would ensure you never hit a rate limit either. I checked our systems and our endpoints allow at least 20 requests per second from a single IP address. Should this be impossible for your service, please contact me and i can see what we can do. We are currently looking into this will see if we can add some extra static datasets for existing articles. Note that the various endpoints have different versions of the DBpedia data: lod2.openlinksw.com dbpedia 3.9 This server is scheduled for a reload soon and will load the dbpedia 3.10 / 2014 data dbpedia.org dbpedia 3.10 Data from Wikipedia dump april/may live.dbpedia.org dbpedia 3.10 + updates Up to date with latest page updates from Wikipedia dbpedia-live.openlinksw.com dbpedia 3.10 + updates Up to date with latest page updates from Wikipedia I will discuss this within the dbpedia live team. Patrick uWhat we provide in this case are owl:sameAs links to other dbpedia language editions. Then you can either dereference the uris to get their abstracts (not available in all languages) or construct the wikipedia url by replacing ' dbpedia.org/resource' to 'wikipedia.org/wiki' Using rdfs:label for this is not recommended. You can get additional labels for the same resource from other extractors that do not match the page name. Dimitris" "new WP articles in DBPedia" "uIf I create a new article about a famous scientist in Wikipedia, how long will it be before it is extracted into DBPedia? Does the new Official DBpedia Live Release track new articles as well as edits to existing articles? uAm 09.07.2011 14:53, schrieb Alexander Nakhimovsky: Yes, DBpedia Live tracks both new articles as well as updates to existing ones. The extracted data from the newly created page should show up within the next 1-5 minutes, see Let us know if you observe any problems! Best, Sören uHi Alexander, yes, DBpedia-Live tracks new articles and changes of existing ones. Actually, the change takes few minutes to be reflected on DBpedia-Live. 
On 07/09/2011 02:53 PM, Alexander Nakhimovsky wrote:" "General schema for SPARQL endpoints" "uI don't have a computer science background and am new to SPARQL, so I apologize if I am asking stupid questions. The way I understand the SPARQL schema is that it imports instance files ( instance_types_XX.nt.gz ), property files ( mappingbased_properties_XX.nt.gz ) into some sort of tables and the Virtuoso server links them together through SPARQL queries. Is there any document that explains how this works? For example, for a query like - ?s a dbpedia-owl:Actor , the results come from instance_types_XX.nt.gz OR - ?s dbpedia-owl:starring ?o , the results come from probably mappingbased_properties Is there any other doc that explains these relationships/tables in more detail? This is important because we then know what files to import to a local Virtuoso server in case of memory issues/long loading times Regards" "Reminder DBpedia Community meeting, 30th January at VU Amsterdam" "uDear DBpedians, organisation of the DBpedia Community meeting in Amsterdam is progressing quite well for the short time we had to organise it. Here are some news: * Schedule: the meeting page and schedule has been (and constantly is) updated: * Twitter hashtag : #DBpediaAmsterdam : * Lunch and drinks are sponsored by the Semantic Web Company ( ( * Almost 50 participants have registered since the last announcement email: # Sponsors: We still have one open slot for a sponsor, who will be linked on the page and prominently placed in the summary report as well. # Presentation submission open: # Register Please register right now here: Hope to see you in Amsterdam, Gerard, Gerald (from Dutch DBpedia), Lora and Victor from VU Amsterdam, Mariano from the Spanish DBpedia and Dimitris and Sebastian" "Error message when using the MappingBasedExtractor on Nippon_Broadcasting_System" "uHello, I've noticed that the MappingBasedExtractor fails when extracting information from the Nippon_Broadcasting_System on WP:en: Anfrage propquery fehlgeschlagen (\"query propquery failed\"): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' Simply changing the default charset for all tables worked for me: sed -i dbpedia_extraction.sql -e 's/latin1/utf8/g' Maybe it'll help someone :) Regards, Michael" "Filtering results from Dbpedia Lookup." "uHi! I looked up URIs using the DBpedia Lookup Service like this: query2 = \"Agatha Christie\" HttpMethod method = new GetMethod(\"http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?\" + \"MaxHits=50&QueryString=\" + query2); Two of the URIs I got were: http://dbpedia.org/resource/Category:Video_games_developed_in_the_United_States and http://dbpedia.org/resource/Category:Agatha_Christie_(video_game_series) Watching both in the browser, we can see that the second is found as a link on the first. I want to add the property owl:sameAs to my author resources, having those URIs as the objects of the triples, but is it correct to have the following triple? author URI -> owl:sameAs -> http://dbpedia.org/resource/Category:Video_games_developed_in_the_United_States That way, for me I'm connecting the author to something that has nothing to do with him, something too generic.
Would it be better to connect to the more specific version only? Like this: author URI -> owl:sameAs -> http://dbpedia.org/resource/Category:Agatha_Christie_(video_game_series) QUESTIONS: 1- Is there any way to filter the results, in order to get more specific URIs? 2- Another problem is the duplicates. Is there any way to avoid having the same URI returned twice? 3- Why, when I specify the QueryClass as Person, do I get only these two URIs: http://dbpedia.org/resource/Category:Fictional_criminologists and http://dbpedia.org/resource/Category:1976_deaths ? Thank you in advance! Bye!" "problem with ASK queries at ?" "uhello all it seems that there is a problem with ask queries using for some queries there are strange results when 'true', for example for ask where { ?y ?z } the result (NTriples) looks like: _:ResultSet2053 rdf:type . _:ResultSet2053 \"ask_retval\" . _:ResultSet2053 _:ResultSet2053r0 . _:ResultSet2053r0 _:ResultSet2053r0c0 . _:ResultSet2053r0c0 \"ask_retval\" . _:ResultSet2053r0c0 \"1\"^^ . _:ResultSet2053 _:ResultSet2053r1 . _:ResultSet2053r1 _:ResultSet2053r1c0 . _:ResultSet2053r1c0 \"ask_retval\" . and so on. however, if i change the query to ask where { ?z } it works as expected (returning 'true'). dunno if this belongs on the bugtracker. if so, please let me know, then i'll put it up there. cheers marco" "Wikipedia dump for the 2015-04 dbpedia dumps" "uHello all I was wondering if someone could point me to the english wikipedia \"enwiki-20150205-pages-articles-multistream\" dump from which the 2015-04 dbpedia dumps were extracted. They used to be hosted on dumps.wikipedia.org but are 404 now.
uHi Praveen, unfortunately we do not host the Wikipedia dumps, maybe we will start doing so in the next releases but didn't decide yet. On Fri, Feb 12, 2016 at 12:30 AM, Praveen Balaji < uThanks Dimitris. I realized that. I took this question to the Wikipedia dumps group. Thank you. Praveen On Fri, Feb 19, 2016, 3:41 AM Dimitris Kontokostas < > wrote:" "how to test if extraction of triples on a wikipedia page is working correctly?" "uHi, I am seeing several cases where triples for a resource are missing when querying live.dbpedia.org. For example, for , the director property is missing ( CONSTRUCT {?s ?p ?o} WHERE { ?s ?p ?o. FILTER(?s = ) } However, I don't know the reason for the director property not being returned - 1) is it because the director property was never extracted from the Infobox on this wikipedia page and therefore it is missing from the backend triple store in live.dbpedia.org? 2) Or is it because the director property is getting dropped somewhere in the process of the query?
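One quick way to separate case 1) from case 2) above is to ask the endpoint directly whether the triple exists at all. A minimal sketch against live.dbpedia.org; the film URI is a placeholder and the SPARQLWrapper client is an assumption rather than anything prescribed in the thread.

from SPARQLWrapper import SPARQLWrapper, JSON

# live.dbpedia.org endpoint as discussed in this thread; the film URI below is a placeholder.
sparql = SPARQLWrapper("http://live.dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    ASK WHERE { <http://dbpedia.org/resource/Some_Film> dbo:director ?d }
""")
print(sparql.query().convert()["boolean"])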
Is there a way in which I can test if the extraction of properties for a given wikipedia page is working correctly and expected properties are getting extracted for that page? Does this require me to install the dbpedia extraction framework (or) is there something simpler than that? Thanks Arun uHi Arun, DBpedia has two 'modes': the static one, served from , and the live one served from . Static DBpedia is updated once or twice a year, with the latest update using data from May/June 2012 [1]. The article you are looking into was created last month [2], so it does not exist at all in static DBpedia. You can use the live DBpedia for that (but there are some technical problems right now, so try later). For improving the framework you can use our online mapping wiki Best, Dimitris [1] [2] On Thu, Feb 28, 2013 at 7:36 PM, Arun Chippada < > wrote: uThanks for your reply, Dimitris. However, I was already using For example, below I listed a few cases where the ontology property for director is missing from live.dbpedia.org. However, the other SPARQL endpoint listed for live.dbpedia - http://dbpedia-live.openlinksw.com/sparql - does return the ontology director property for the above resources. But my understanding is that the live.dbpedia.org/sparql endpoint is more recommended, as it is based on the new Java framework. Am I right? Is there a reason why these properties are missing when querying using the live.dbpedia.org/sparql endpoint? Thanks Arun uHi Kingsley, Just wanted to get clarification if it is expected that the two endpoints ( CONSTRUCT {?s ?p ?o} WHERE { ?s ?p ?o. FILTER(?s = ) } Or is this an issue with the Thanks Arun No. They are variants of the same thing, but clearly not 100% the same. Regards, Kingsley Idehen Founder & CEO OpenLink Software Company Web: Personal Weblog: Twitter/Identi.ca handle: @kidehen Google+ Profile: 740508618350/about LinkedIn Profile:" "What's disjoint in the dbpedia ontology?"
"uPerhaps the Dbpedia Ontology is restricted to OWL Lite, but I'd really like to see some disjointWith statements in it uWhat would change if there were disjoint statements? Are disjoint declarations used for more than just verifying that dbpedia is consistent? 2009/7/30 Paul Houle < >: uPeter Ansell wrote: There are a lot of uses. I wouldn't trivialize the verification part either: the wikipedia \"street level\" view hides serious consistency problems that look quite embarrassing from 8000 feet up: if you're making a product that gives people that 8000 foot view and you don't want to get laughed out of town, you need to deal with them. Also certain inference procedures will fail when applied to inconsistent data, so providing a consistent view is an important processing step u2009/7/31 Paul Houle < >: Consistency would be nice, but relying on it being there might permanently make DBpedia unuseful for strict reasoners, as to change something in DBpedia you effectively have to convince the active editors on Wikipedia that there is an issue. And they don't take kindly to being pushed into positions without long reasoning. If you have ideas for reforming the category structure at Wikipedia then feel free to try, but you won't get a result there without a lot of effort. SemWeb ventures that attempt to do too much already get laughed out of town, but DBpedia seems to have struck a balance, due mostly to the effort by all of the volunteer editors at Wikipedia (although the transformation to RDF definitely takes another level of effort to maintain). Trying to maintain a different category structure in DBpedia to that in Wikipedia is bound to fail due to this reliance on the Wikipedia editors to maintain the millions of different subjects in the way they know best. uPeter Ansell wrote: Well, the dbpedia ontology already has concepts in it that don't seem to exist (in concrete form) inside Wikipedia. For instance, Wikipedia doesn't have well-defined categories for \"Person\" or \"Automobile Model\", instead it's got this network of categories like and then you've got those awful List pages Simply by creating a category like \"Person\" that doesn't really exist in wikipedia, the dbpedia ontology is already going beyond simply reflecting wikipedia and towards interpreting it. This isn't a problem, in my mind, because it's a finite problem. I don't think there's a big disagreement about how to define the function IsAPerson(DbPediaUrl x). Secondly, Persons are one of a limited number of major categories that exist in wikipedia. uPaul Houle wrote: While we're at it - NACE codes for companies would be nice as well ;)" "Reg: GSoC 2017 Chatbot Project - Project Page" "uHi All Once again thank you very much for selecting me for the GSoC 2017 Chatbot Project. As required, I have created a simple project page to track progress. Please do take a look and share your feedback which would be of great help to me. Wiki Project Page: Thanks Ram G Athreya Hi All Once again thank you very much for selecting me for the GSoC 2017 Chatbot Project. As required, I have created a simple project page to track progress. Please do take a look and share your feedback which would be of great help to me. Wiki Project Page: Athreya" "DBpedia Update and Pending Changeover" "uAll, We are about to transition DBpedia from a Virtuoso 5 instance over to a Virtuoso 6.x (Cluster Edition) instance sometime this week after a few days of live testing against a temporary secondary instance [1]. 
The new release also includes our much vaunted server side faceted browsing [2] that leverages SPARQL aggregates and configurable interaction time for query payloads (note the retry feature which increments the interactive time window to demonstrate full effects). In the process of conducting this effort, we discovered that the Yago Class Hierarchy hadn't been loaded (courtesy of some TBox subsumption work performed by the W3C's HCLS project participants such as Rob Frost). In addition, Virtuoso 6.x introduces TBox subsumption via new Virtuoso-SPARQL transitivity options as exemplified below using the V6.0 instance at: ." "loading dbpedia benchmark" "uHi, I am a newbie to the dbpedia benchmark. I've installed Virtuoso and I have downloaded the dbpedia benchmark from and loaded it into Virtuoso using: SQL> ld_dir('/home/username','*.nt',' know how to verify in Virtuoso that I've all the data in dbpedia and how I can get SPARQL test queries. Thanks for your help. uHi, On 04/12/2012 03:39 PM, soumia sefsafi wrote: In order to verify that you have all data loaded, you can simply execute a count query, in order to get the count of loaded triples. So simply use the following query: SELECT count(*) WHERE { ?s ?p ?o } Regarding the second question, assuming you are loading the DBpedia data into a local Virtuoso instance, you should have something like Through this page, you can ask some SPARQL queries against your local DBpedia instance" "Feedback Session Thu 13:30 for the Association (Funding and DBpedia Improvement)" "uSummary: your chance to propose how to improve DBpedia and give feedback and add questions is here: how-to-improve-DBpedia" "endpoint returning diff results - browser, jenasparqlservice" "uHi all, I was wondering what's happening: the following query returns only yago (root classes) when executed with the browser (firefox), but when running the query with the jena sparql service it also returns opencyc concepts and some others (including \"classes\" that apparently don't have subclasses) CONSTRUCT { ?y ?label } WHERE { ?sub ?y . ?y ?label . OPTIONAL {?y ?super} . filter(!bound(?super))} By the way, if I also include in the construct clause the subclasses of ?y (?sub ?y) then jena can't parse the resulting xml (invalid chars) CONSTRUCT { ?y ?label . ?sub ?y } WHERE { ?sub ?y . ?y ?label . OPTIONAL {?y ?super} . filter(!bound(?super))} thanks for attention uok the extra results are because the graph uri =] but what can explain the opencyc concepts in the results if they don't have any subclass? something wrong with the query?
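As noted just above, the extra results came from querying the whole store rather than a single graph. The scoping can be made explicit in the query itself; a sketch that restricts the CONSTRUCT pattern discussed above to the http://dbpedia.org graph (the graph IRI is the one conventionally used by the public endpoint; the requests-based client is an assumption).

import requests

construct_query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT { ?y rdfs:label ?label }
FROM <http://dbpedia.org>
WHERE {
  ?sub rdfs:subClassOf ?y .
  ?y rdfs:label ?label .
  OPTIONAL { ?y rdfs:subClassOf ?super }
  FILTER(!bound(?super))
}
"""
# Asking for Turtle keeps the payload easy to read; the Accept header drives the format.
response = requests.get("http://dbpedia.org/sparql",
                        params={"query": construct_query},
                        headers={"Accept": "text/turtle"},
                        timeout=120)
print(response.text)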
uNevermind, sorry for any inconvenience, it was all the default graph uri. The only question that remains is the parsing error in jena, maybe a jena issue or some misconfig. Best regards uOn 10/17/10 3:23 PM, Mauricio Chicalski wrote: If you don't scope your query to a specific Named Graph using the Graph IRI: , your queries will be scoped to the entire Virtuoso Quad Store (which hosts a number of Named Graphs per instance). We have a number of ontologies loaded into the Virtuoso instance that hosts the public DBpedia SPARQL endpoint. This enables people to perform sophisticated reasoning over DBpedia using a broad collection of ontologies that include: OpenCyc, Yago, UMBEL, SUMO, FAO etc Kingsley" ""live updating", tracking resource changes" "uGreetings DBPedia people, I'm looking forward to DBPedia getting the \"live updating\" feature (what are you calling it? the feature whereby the dbpedia datasets are updated in nearly real time as changes are committed to Wikipedia?). One thing is unclear to me, however: if resource X is changed to resource Y, will there be any explicit mechanism for discovering that, in both directions, after the fact? Obviously there's 'X Y', but I suspect that that won't be sufficient and reliable. Thanks, John This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uHi, John Muth wrote: We call it the \"DBpedia Live Extraction\". Whenever a page on Wikipedia is moved, we get two events: The old page containing the redirect, and the new page containing the original content. So indeed currently the triple \"old redirect new\" would be generated (and all other triples with the old page as the subject would be removed). Maybe you could give some examples of what you had in mind when you said this might not be sufficient or reliable? Kind Regards, Claus Stadler uHi, The solution with redirects relies on someone setting a redirect in Wikipedia when an article is renamed.
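A small illustration of how such redirect chains can be followed in the released data, assuming the dbo:wikiPageRedirects property and a SPARQL 1.1 property path; the starting URI is a placeholder, not taken from the thread.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
# Follow a chain of redirects from an old title to the resource that is not itself a redirect.
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?target WHERE {
      <http://dbpedia.org/resource/Old_Title> dbo:wikiPageRedirects+ ?target .
      FILTER NOT EXISTS { ?target dbo:wikiPageRedirects ?next }
    }
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["target"]["value"])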
I'm not sure if there's a mechanism in Wikipedia to automatically create a redirect whenever an article is renamed. If there is one, then querying for a redirect would be sufficient, otherwise it wouldn't. Best, Georgi uHi Claus, Thanks for your reply. I guess I worry that there's no distinction between a redirect caused by a page being moved and a redirect for other reasons (redirects do appear for other reasons I assume?). So if I want to find all the past identities of resource Y, I can't. Now that might not actually be a big deal, I just want to clearly understand what's possible. What you've said has confirmed that I can at least reliably do the lookup in the other direction: find the current identity of resource X by just following the redirect or chain of redirects forward. John On 14/10/09 13:43, \"Claus Stadler\" < > wrote: This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. uAccording to indeed automatically created for the old page. So that's good news. Minor diversion: if a page is renamed a second time, the redirect from the original page is not automatically updated: the guide advises editors to fix those links themselves. But even if individual editors fail to do so that shouldn't be a problem as long as the chain of redirects still exists we can follow them to get to the latest incarnation. So it sounds like there's no problem as long as we only care about getting from old resource URIs to new. Thanks, John On 14/10/09 14:00, \"Georgi Kobilarov\" < > wrote: uHi, Well the basic idea of the live extraction is to have a synchron version with respect to Wikipedia. So if pages or redirects do not exist in Wikipedia, they should not exists in DBpedia. I think the updates we get from Wikipedia are quite comprehensive, so this goal will be reached. Regards, Sebastian" "Linguistic popularity of a resource in Wikipedia and DBpedia" "uHello all I would like to add to the Lingvoj Ontology [1] a couple of properties enabling to make explicit and easy to query the \"linguistic popularity\" (not sure it's the best word for it) of a resource, singularly in DBpedia. What I mean by that is the set of languages in which a given DBpedia resource has a WP version. This information is generally not completely available, and not easy to extract, from the current DBpedia description. There are generally more of such languages than the ones used for labels and comments in the DBpedia description, and even those are not obvious to extract from the page. Looking for example at my place in DBpedia [2], I find that is is named and commented in 12 languages, but the source WP article [3] has 26 interwiki links, meaning it's described in 27 linguistic versions of WP. Without asking DBpedia to gather labels and descriptions in all those languages, it should be easy to fetch this language list from the interwiki links in the page, and include that in the description of the resource. The size of the list is also a good indicator of the popularity of the topic. 
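Until a dedicated property such as the proposed :wpLanguagesNumber exists, a rough approximation of this popularity measure can already be computed from the labels DBpedia loads (which, as noted above, cover fewer languages than the interwiki links). A sketch using the author's Guillestre example; the counting approach and the client are assumptions.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT (COUNT(DISTINCT lang(?label)) AS ?languages) WHERE {
      <http://dbpedia.org/resource/Guillestre> rdfs:label ?label .
    }
""")
print(sparql.query().convert()["results"]["bindings"][0]["languages"]["value"])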
I see a lot of applications for having such information available, either for Wikipedia editors (which resources in such category are not yet translated in my language, etc) or data users. What I propose would be to add the following to the Lingvoj Ontology @prefix : :wpLanguage a rdf:Property rdfs:range :Lingvo ; rdfs:label \"available Wikipedia language\"@en ; rdfs:comment \"The language of a Wikipedia version in which the subject resource is described\"@en. Values of this property could be either instances of languages defined in DBpedia, or more standardized values such as those defined by lexvo.org For instance : :interwikiLanguage < And for the linguistic popularity :wpLanguagesNumber a rdf:Property rdfs:range xsd:positiveInteger rdfs:label \"number of Wikipedia languages\"@en; rdfs:comment \"The number of languages in which the subject resource has a Wikipedia article\"@en. Comments welcome before I add those properties. Bernard [1] [2] [3] Guillestre" "OntologyProperty etc." "uBefore I create a huge mess I better ask showing you the first bits I made - even reading the documentation I don't understand well what is meant and get mixed up by the examples I find. I am trying to map the following: This became: in there by now) Now the problem lies in \"GEWAESSER\" (=water) Since a \"water\" is like a river a \"BodyOfWater\" it became an Ontology class: (which IMHO is correct like I created it) Then back to the mapping: here \"water\" needed to be created as an OntologyProperty: don't have a clue if this is correct, because I simply don't find a valid example) I looked at \"river\" to get an example: but what I don't understand is why rdfs:domain = Island It could also be that I simply don't see the obvious - but sorry, I don't see it. For now I will go ahead mapping only existing parts of Infoboxes, but it's a pity because almost none will be complete. Cheers and thanks in advance, Bina uHi Sabine There is no need for your new ontology class \"Water\". The class BodyOfWater already represents this general class for water areas. For the Atoll mapping you need an ontology property for a BodyOfWater in which the Atoll is located. To make this clear you should name the ontology property locatedInBodyOfWater. That would look like this: {{ObjectProperty | rdfs: = locatedInBodyOfWater | rdfs: = ex. an ocean is a water | rdfs: = Gewässer | rdfs:domain = place | rdfs:range = BodyOfWater }} But the easier and more abstract way is to take the locatedInArea ontology property that already exists. Sorry, you couldn't understand it because this is a mistake. The domain should be Place. Another example: The domain of your OntologyProperty archipelago should be the new ontology class \"Atoll\" and the range could be an \"Archipelago\" class. If there is no class like that and you don't want to create it, leave out the range property. The range is set to owl:Thing by default (the same for the domain).
{{ObjectProperty | rdfs: = archipelago | rdfs:domain = Atoll }} Regards, Paul uWhoops, sorry, this didn't go to the list" "mapping testing for non-English mappings" "uHi, we have uploaded various mappings for the el.wikipedia but the \"test this mapping\" link doesn't work anymore. About a month ago, when we uploaded about 2-3 mappings just to see how they work, the testing worked fine. Is it something that was done on purpose or a misconfiguration? Thanks, Jim uHi, sorry, our server was down. It should work again now. Cheers, Max On Thu, Jul 8, 2010 at 7:42 PM, Dimitris Kontokostas < > wrote: uSorry again, but it doesn't seem to work. The mappings with greek (non-ASCII) characters do not return test results, e.g. (mapping_el) for mapping Βιβλίο , the test does not return anything, while for Infobox Album , the test works fine. thanks, Jim On Mon, Jul 12, 2010 at 4:11 PM, Max Jakob < > wrote: uHi Dimitris, I'm sorry but there has been a problem with the encoding of non-ASCII namespaces which I just fixed. Your links are working for me now, please try again. Cheers, Robert On Wed, Jul 14, 2010 at 12:24 AM, Dimitris Kontokostas < > wrote: uHi, it works fine! Thanks again, Jim 2010/7/14 Robert Isele < >" "values of "dbpedia-owl:wikPageDisambiguate" - how are they extracted" "uHi all I have a possibly naive question but I am not able to find the answer elsewhere. My task is to extract candidate concepts/entities for an ambiguous term from dbpedia, e.g., \"cat (disambiguation)\". To do so I am looking at the \"dbpedia-owl:wikiPageDisambiguates\" field for the dbpedia page: against \"en.wikipedia.org/Cat_(disambiguation)\". I would expect to see more or less all candidates listed on the Wikipedia Disambiguation page to be covered by the dbpedia field \"dbpedia-owl:wikiPageDisambiguates\", however there is quite a large discrepancy - out of which the most odd one is that the candidates on the dbpedia page do not even include the animal sense of \"cat\", and in fact it is included in \"wikiPageWikiLink\". I wonder how exactly does dbpedia extract candidates from wikipedia \"disambiguation\" pages? It is clear to me that some filtering has been done but it is not clear what it is.
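For reference, the extracted disambiguation targets can be listed directly from the endpoint. A sketch, assuming the property is spelled dbo:wikiPageDisambiguates in the loaded release (property names have varied across releases, so treat it as an assumption).

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?candidate WHERE {
      <http://dbpedia.org/resource/Cat_(disambiguation)> dbo:wikiPageDisambiguates ?candidate .
    }
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["candidate"]["value"])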
According to the dbpedia source code documentation in \"extraction_framework/core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala\" which says \"Extract only links that contain the page title or that spell out the acronym page title\", it should selects many candidates that are currently missing in the \"wikiPageDisambiguate\" filed, but now in the \"wikiPageWikiLink\" field. Can any one shed some light on this please? Thanks! uOn 23.05.2012 11:44, Ziqi Zhang wrote: My answer is indirect - do you think that disambiguation pages are the best place to look for the different meanings of a given term? All the articles I read about Wikipedia-based disambiguation uses the names of links that link to given Wikipedia article as a source of candidate senses for a given ambiguous terms. Check out DBpedia spotlight paper [1] or Wikipedia Miner [2] which use this approach. Cheers, Aleksander [1] [2] link with Wikipedia concerns the disambiguation in great detail). uThanks for your quick reply and the pointers. I suppose you do use the Wikipedia disambiguation page to look for candidate senses of a term, considering it as a sense inventory in general. Whether you should use the dbpedia disambiguation page as an alternative I dont know, since it is related to my question and my observation that it is not a 100% mirroring of the wikipedia version. While both [1] and [2] deals with sense disambiguation, it is not clear how they select \"candidate senses\". in [1] \"We use the DBpedia Lexicalization datasetfor determining candidate disambiguations for each surfaceform.\" and by refering to the spotlight webpage \"(Wikipedia) Disambiguations provide ambiguous surface forms that are 'confusable' with all resources they link to. Their labels become surface forms for all target resources in the disambiguation page.\" - which suggests that dbpedia also uses the Wikipedia \"disambiguation page\" to look for candidate senses of a term, but perhaps some \"filtering\" strategies are used; in [2] candidate spotting is not discusssed. As to my question, I am curious in how a \"disambiguation page\" in wikipedia is converted to a dbpedia page, such that in a lot of cases, many candidate links on the wikipedia page (e.g., \"wikipedia/wiki/Cat_(disambiguation)\") are not included as disambiguation candidates on the corresponding dbpedia \"disambiguation\" page (i.e., \"dbpedia.org/page/Cat_(disambiguation)\". Thanks On 23/05/2012 14:49, Aleksander Pohl wrote: uThanks, I see you point. Yes its arguable whether one should assume the disambiguation page as a sense inventory for WSD. It's true that the other approach clearly has certain advantages but also possibly extensive computational overheads. In my case i am considering other tasks rather than WSD, say determinig the degree of synonymity of two words, where no contexts are given for disambiguation. Typically one evaluates this by looking at the possible senses for each term - in theory I can also take the whole knowledge base as the possible candidate sense inventory regardless of what are listed on the \"disambiguation\" page but that seems to be lots of computation and impractical. Still, thanks for your input! On 23/05/2012 16:28, Aleksander Pohl wrote: uOn 23.05.2012 18:04, Ziqi Zhang wrote: It's true that it requires a lot of computation. But both DBpedia extraction framework and Wikipedia Miner provide ready made solutions for this task. 
What is more, you can download the data extracted from the English Wikipedia by Wikipeda Miner from its homepage [1]. It is not as fresh as latest DBpedia, but still it is quite valuable. Cheers, Aleksander [1] Wiki.jsp?page=Downloads uOn Wed, May 23, 2012 at 5:44 AM, Ziqi Zhang < > wrote: Like some of the other answers, not directly relevant, but another signal that you can use is inclusion in Freebase. Freebase includes basically all of Wikipedia, but aggressively deletes both disambiguation pages and list pages. I'd also point out that not all ambiguous Wikipedia articles are tagged as disambiguation pages. If you look at the split_to hint properties in Freebase, you can find Wikipedia articles which were split into separate concepts after import. Tom p.s. As far as computational complexity goes, none of the things being discussed strike me as being computationally infeasible unless you are on a very, very limited budget. uHi, the DBpedia data [1] was extracted from an old version [2] of the Wikipedia page. That's probably the main reason for the discrepancy with the current Wikipedia page [3] you observed. For example, that version contained a link to [[domestic cat]]. DBpedia only extracts disambiguation links that contain the disambiguated word, and the case must also match. In this case, the disambiguated word is 'Cat', but the link contained 'cat', so it was not extracted. I just changed the DisambiguationExtractor to use case-insensitive matching. That should let us extract a few more correct disambiguation targets in the next release without adding too many wrong ones. JC [1] [2] (or a version close to it) [3] On Wed, May 23, 2012 at 11:44 AM, Ziqi Zhang < > wrote: uHi Thanks for clarificaiton, yes that does make sense. On 24/05/2012 19:03, Jona Christopher Sahnwaldt wrote:" "dbpedia content" "udear all, I am new to dbpedia and wiki related technology so please forgive the naiveté of my question. we are trying to generate a wiki page for each and every scientific article that has been published. we have harvested major bibliographic repositories. We now have all we need as a minimal amount of information for the generation of the page with the corresponding metadata with out exposing the content or violating copyright. We would like the pages to be created automatically, is there an api for that facilitates the automatic generation of wikipages? is this allowed?could this be done? Ultimately I want every piece of available metadata for research articles to be part of dbpedia, for those scietific articles that are open access I am also making available an extended metadata which is based on topics extracted from the content. Also, for open access content we would like to have them as wikipages -we are looking if this is legal. How could I generate content for DBPEDIA without generating a wikipage? we could generate the content for wikipedia for any given published paper; this may then uHi Alexander, On 3/17/15 4:19 AM, Alexander Garcia Castro wrote: Interesting! Could you share a sample of such metadata? You can produce a dataset that complies to RDF standards. If your content is multilingual, I would recommend to use the Turtle syntax [1], which supports UTF-8 encoded IRIs. Cheers! 
[1] > we could generate the content for wikipedia for any given published uHi Alexander, If you only want to create wiki pages, you can use the wikipedia api and a bot framework such as JWBF [1] of Pywikibot[2].However, publishing this metadata directly to Wikipedia might not be feasible since Wikipedia might see it as spam. If you wish to generate an rdf dataset we will try to give you some helpful tips. Cheers, Alexandru [1] [2] On Wed, Mar 18, 2015 at 11:22 AM, Marco Fossati < > wrote:" "Redirects Dataset" "uHi all, I wonder why the Redirects Dataset exists only for the English Language. I think the generalization of the parser to handle redirects in every Wikipedia language shouldn't be too complicated. Maybe there are some other reasons for this. In this case I'd like to go more in depth. Thanks, Riccardo uOn 15.02.2012 15:06, Riccardo Tasso wrote: Yes indeed, this should be quite simple - The Wikipedia Miner [1] already processes redirects for various languages. Aleksander [1] uRiccardo, Aleksander, you can have a lool at this older threads Cheers, Dimitris On Wed, Feb 15, 2012 at 4:33 PM, Aleksander Pohl < > wrote:" "extraction" "uHi, Do anybody know a description about how to set up an environment for the extraction framework to run it on live wikipedia or on the dumped data file? (I mean which data file I need to download, If I need a DB how it should be configured,) thanks Bottyán" "DBPedia in something other than Virtuoso?" "uHi folks, I'm curious if anyone has successfully loaded the DBPedia dataset into a DBMS other than Virtuoso (NoSQL or otherwise). If so, how has it been? How did it compare to virtuoso overall and more specifically in performance, scaling, etc. I read an article once about somebody loading the dataset into HyperTable which seemed pretty limiting in general query flexibility but was more performant. Thanks, Parsa Hi folks, I'm curious if anyone has successfully loaded the DBPedia dataset into a DBMS other than Virtuoso (NoSQL or otherwise). If so, how has it been? How did it compare to virtuoso overall and more specifically in performance, scaling, etc. I read an article once about somebody loading the dataset into HyperTable which seemed pretty limiting in general query flexibility but was more performant. Thanks, Parsa uHi Parsa, On 01/16/2013 02:10 PM, Parsa Ghaffari wrote: This thread [1] describes how to create a DBpedia mirror on Jena-TDB. And regarding to the performance of each one with DBpedia dataset you can check The DBpedia SPARQL Benchmark (DBPSB), and its results which are available here [2] [1] [2] DBPSB u0€ *†H†÷  €0€1 0 + uKingsley, I'm referring to this article: method is utilized that mimics only a subset of SPARQL at the benefit of achieving a higher performance and in the context of that article specifically, by higher performance I mean faster bulk load speed for DBPedia datasets. I love virtuoso, I only wish the opensource edition had *some *kind of replication at least. Parsa On Wed, Jan 16, 2013 at 2:26 PM, Kingsley Idehen < >wrote: uThanks for the links Mohamed. On Wed, Jan 16, 2013 at 2:22 PM, Mohamed Morsey < > wrote: u0€ *†H†÷  €0€1 0 +" "simple query problem" "ui ran a simple query select ?Company where { ?Company rdf:type dbo:Company. } i got the output but it does not contain renowned companies like google, microsoft why is it so please help me regards mohammad ruksad i ran a simple query select ?Company where { ?Company rdf:type dbo:Company.    
Cheers, Roland On 01/17/2013 11:07 PM, Ben Companjen wrote: uHi Ben, Roland I've been following this for a while and according to OCLC there should be ~250000 ;) I already tried to see how we can handle this info but unfortunatelly it is not supported by the framework yet so. please do not make any mappings to it for now, they will probably produce a duplicate (blank) node if there is another mapping for the page and will make things worse I already suggested a solution a few days ago in similar thread You should create something like a \"noMapToClass\" directive in the mappings but this needs some work. Until then i think that dbprop:viaf can do the work [1] Best, Dimitris [1] On Thu, Jan 17, 2013 at 11:24 PM, Roland Cornelissen < > wrote: uHi Dimitris, Thanks for interrupting ;) It's a good thing then, that I was held back by the definition of the LCCN property (domain WrittenWork). I'll let the mapping rest for now. Based on my understanding of the \"How to create a mapping\" pages, I was going to use only PropertyMappings, as the template does not \"define\" a person like an infobox; it is included with other templates (I presume). Would they have created new nodes? And what do you mean by \"[handling this info] is not supported by the framework yet\"? Is it too much data? You mention dbprop:viaf, but not how it came into being. I can now answer my previous question: after another look at the wiki [2], I think \"viaf\" and \"lccn\" must have been properties in an infobox template before the Authority control template was introduced. Or are these raw properties still being created from any template property? I've been editing and answering some DBpedia-tagged questions on StackOverflow, but can always learn more. :) Ben [2] On 17 January 2013 23:45, Dimitris Kontokostas < > wrote: uit seems that some authority template instances existed before the latest DBpedia release (e.g. [1]) so these properties were extracted from the infoboxExtractor. Infobox extractor extracts greedily and usually it has very low quality, but in this case viaf is a number so the quality is actually very good. The mapToClass definition is obligatory for now. About the \"new nodes\" behaviour you can read the entries in the wiki or the complete manual (sec 5.2 mostly) Best, Dimitris [1] On Fri, Jan 18, 2013 at 12:58 AM, Ben Companjen < >wrote:" "Test Bulgarian mappings" "uHi, Last week I corrected some of the Bulgarian mappings connected with \" ????????? ??????????\", because they were wrong. How I can test the mappings because it seems that the links on Test this mapping does not work. If you click on Main or File it doesn't return any results. Cheers, Boyan Simeonov Hi, Last week I corrected some of the Bulgarian mappings connected with \" Музикален изпълнител\", because they were wrong. How I can test the mappings because it seems that the links on Test this mapping does not work. If you click on Main or File it doesn't return any results. Cheers, Boyan Simeonov uThanks for the notification. It's probably an encoding problem: extractionSamples/ doesn't seem to work if there are any non-ASCII characters in the name of the template. For example, Mapping_de:Infobox_Fluss works, but Mapping_de:Infobox_Straße doesn't: On Mon, Dec 8, 2014 at 12:38 PM, Boyan Simeonov < > wrote: uI just opend On Mon, Dec 8, 2014 at 8:02 PM, Jona Christopher Sahnwaldt < > wrote: uUntil this is fixed: Boyan, you can test invididual pages here: I noticed this in another email by Dimitris. Thanks, this is extremely useful! E.g. 
this is Lili Ivanova: - mapped only: - all extractors: Lili Ivanova is now MusicalArtist (previously was MusicalGroup), yay! Boyan will next fix dbo:sex \"a\" to become dbo:gender dpm:Female, and to emit dbo:gender dpm:Male for people who don't have this magical letter \"а\" (which comes from raw prop \"наставка\" meaning \"suffix\": many BG female surnames end in \"а\")" "Dbpedia Spotlight as client." "uHi! I found that code: Class 1: Class 2: Class 3: On the class: DBpediaSpotlightClient, on the part: File input = new File(\"/home/pablo/eval/csaw/gold/paragraphs.txt\"); File output = new File(\"/home/pablo/eval/csaw/systems/Spotlight.list\"); *How can I see the ouput? What is that file .list ?Is there any configuration I have to do to get the output?Thank you!Hi! I found that code: Class 1: you! uThe author of the code (Pablo Mendes) answered me: \"The output is a text file. :) It contains a list of entities extracted. Just look at it with your favorite text editor. Cheers, Pablo\" Just to register the answer here. Bye! 2013/8/10 Luciane Monteiro < >" "uri in french i18n version" "uHi I've downloaded the french i18n version of dbpedia ( ex: surprised to find dbpedia.org uris whereas I was expecting fr.dbpedia.org uri's My expectation seemed to be confirmed by the interlink data ( linking fr.dbpedia.org resources to any other language resources. For example Whereas in the French i18n data the uri used is and not Can someone explain to me how to use this interlink data ? Thanks Michel DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\" uri in french i18n version" "Date of modification" "uHello everybody, is there a way of constraining the age of a resource with regard to when it was created and/or modified? This would allow, for example, to query only those resources that have been recently added to the Wikipedia. Thanks for any hints. Best regards, Marian uHi Marian, we do not extract metadata about Wikipedia articles (revisions, date of modification, etc) at the moment. That's on our todo list for the upcoming work on live-updating DBpedia, but I can't give any timescale for that yet. Best, Georgi" "SPARQL dbpedia returned data" "uHello, For a school project I need to link my own data to the dbpedia data, before I can do this i need to examine some things to see on what property i can link the two.While trying some SPARQL statements on dbpedia to look for a good linking property I noticed that I could not get any results about certain objects, which are exactly the objectsI need. Example: I want to test university labels from my own data against the ones in dbpedia, so to check if I could even find any results I tried the following SPARQL: SELECT ?aWHERE { ?a rdfs:label \"Ege University\"} This returns nothing, even though i copied the label directly from the Ege University page. Does anyone know what's the problem here? Thanks,Camiel uHi Camiel, On 10/16/2011 02:37 PM, Camiel $B\"%\"'\"%\"'\"%\"'\"%\"'\"%\"'(B wrote: In order to get results, you should append the language to the label so the query should be SELECT ?a WHERE { ?a rdfs:label \"Ege University\"@en } Hope that helps." "ImageExtractor getImageUrl" "uI'm trying to use the ImageExtractor but it doesn't seem to work with utf8 characters correctly. For this filename: !!!!東京大学総合研究博物館小石川分館0001.jpg I get But i should get: Any ideas how to get the right hash prefix(d/d8)? I tried url encoding the file name first but that did not produce the right hash prefixes." 
"Consultation: DBpedia Spotlight Internationalization" "uHi, I'm working in Dbpedia Spotlight Internationalization from AppStylus SL in Barcelona (Spain) and I have been searching the DBpedia dumps in repository ( aren't dirs named cn, jp and kr for China, Japan, and Korea respectively. Could you tell me which are the dirs in the dbpedia repository for Chinese, Japanese and Korean languages? Thank you very much!, Jairo Sarabia AppStylus SL developer Hi, I'm working in Dbpedia Spotlight Internationalization from AppStylus SL in Barcelona (Spain) and I have been searching the DBpedia dumps in repository ( developer uHello Jairo Generally \"ko\" stands for Korean in DBpedia (Ref.: So you can look for the directory named \"ko\" for Korean dump. For Chinese or Japanese DBpedia dump directory wait for the others to reply Warm regards. Arup On Tue, Dec 20, 2011 at 10:09 PM, Jairo Sarabia <" "WORLDCOMP and Hamid Arabnia" "uDefamation campaign against WORLDCOMP and Hamid Arabnia For the last several months, a systematic defamation campaign is going on against the worlds' biggest computer science conference WORLDCOMP, eg. WORLDCOMP is addressing this matter legally and a lawsuit has been filed to resolve this matter (visit WORLDCOMP's website side). Our preliminary investigation found the footprints of the actual persons who are sending all these defamatory comments about WORLDCOMP and its Chair Hamid Arabnia. As of now, these are the persons behind this defamatory campaign: http://www.cis.famu.edu/~hchi http://www.scs.gatech.edu/people/mustaque-ahamad http://www.cs.fsu.edu/~xyuan http://www.unf.edu/~ree http://www.johnlevine.com http://curly.cis.unf.edu http://en.wikipedia.org/wiki/Albert_Shiryaev http://www.cse.sc.edu/~jtang http://www.ninaringo.com http://www.cis.famu.edu/~prasad http://www.scs.gatech.edu/people/maria-balcan http://www.f4.htw-berlin.de/~weberwu http://www.iaria.org/speakers/PetreDini.html http://www.eecs.ucf.edu/index.php?id=profiles&link;=joseph_laviola (more names will be announced later on…) These people formed a team and mailing to different forums, groups, blogs and individuals, heavily criticizing WORLDCOMP. Some of them have personal or professional enmity with Professor Hamid Arabnia and some of them don't like WORLDCOMP for one reason or the other. They are using proxy servers in Georgia (Athens, Atlanta), Florida (Tallahassee, Jacksonville, Orlando), Chicago and Texas (Austin, Houston) and sending the defamatory emails. I request all of you to submit papers and make WORLDCOMP 2012 a success. All tracks of WORLDCOMP have received high citations. I assure you that WORLDCOMP will be held in July 2012 and it will continue for many years to come. I know Professor Hamid Arabnia well and he is a very nice and professional person and he is committed to organize WORLDCOMP in 2012, 2013, 2014, 2015, With sincere respects, Mohammad Homayoun Note: This message is sent to help defend my longtime friend Professor Hamid Arabnia http://www.cs.uga.edu/~hra (chair and coordinator of WORLDCOMP)" "errors in the page "Datasets loaded into the public DBpedia SPARQL Endpoint"" "uHello, I am writing concerning the page at It lists the subset of the dbpedia download files that are loaded in the public SPARQL endpoint. I noticed a couple of issues with the list: (1) The download files, listed on the page as: \"infobox_properties_en\" and \"infobox_property_definitions_en\", are actually called \"raw_infobox_properties_en\" and \"raw_infobox_property_definitions_en\". 
(2) If I execute the following query: SELECT (COUNT(?s) AS ?c) WHERE { ?s rdfs:label ?label . FILTER (REGEX(STR(?label), \"embarcadero\", \"i\")) } LIMIT 1000 on dbpedia, and on a local triple store with only the files from the page loaded, I get two different counts. I get 189 from dbpedia and 96 from my local store. It seems to be that the list on the page is incorrect, and as evidenced by (1) the page is not generated by a process occurring during the dbpedia build procedure. Is it possible to get an up to date list (this would greatly benefit my work)? and preferably one that is not generated by hand but from a definitive source that is part of the SPARQL endpoint database build process? And maybe some notes on the web page stating how the list is obtained would be very useful. Cheers, Tim Harsch Hello, I am writing concerning the page at Harsch uOn 1/17/14 6:25 PM, Tim Harsch wrote: The public DBpedia SPARQL endpoint provides access to more than one named graph in the Virtuoso DB from which it is deployed. If your query against the DBpedia endpoint isn't scoped to the graph IRI you will encounter these kinds of issues." "FYI: Get Semantic with DBPedia and ActiveRDF" "uHi, there is a new blog post that show how DBpedia is used together with ActiveRDF. See Cheers Chris" "Reminder skos:subject is deprecated categories at dbpedia.org/resource/category:" "uNot sure what is the answer to Amir's question is, but a side remark : skos:subject has been deprecated in the final SKOS recommendation [1] and IMO should be replaced throughout DBpedia (and the linked data cloud in general) by dcterms:subject [2] It's been discussed at length on DC list if the range of dcterms:subject should be restricted or kept open as it is now, but in any case using skos:Concept instances as values of this property is considered a good practice from both SKOS and DC sides. If I remember well, skos:subject was removed from the spec on the very argument that SKOS should not specify how concepts should be used for indexing, letting this to metadata specifications such as Dublin Core. And singularly dcterms:subject was in the radar. Should not be to difficult to change. Since DBpedia is the showcase of Linked Data seems to me it should lead the way in good practice re. conformance to standards :) Cheers Bernard [1] [2] 2009/12/16 Amir Hussain < > uBernard Vatant wrote: While this change gets synchronized (at the data set level) we can do the following: 1. Make an Inference Rules Context that asserts equivalence between dcterms:subject and skos:subject 2. Publish the rule so that when constructing SPARQL against DBpedia people can optionally enable the Inference Context via our SPARQL pragmas feature. Kingsley" "OpenCorporates" "uI have started a discussion [1] on addling links from Wikipedia articles about companies, to the respective pages on OpenCorporates; and mentioned the possibility of including them in DBPedia,too, since OpenCorporates is an open database, available as xml, json or rdf. Please feel free to comment, or advise on how this might be achieved, in that discussion, or this mailing list. [1] [2] uOn 6/22/11 11:42 AM, Andy Mabbett wrote: You can just do the following, pronto: 1. Make the links to DBpedia URIs 2. Publish you linkset in a format of your choice. Once done, I have it loaded into its own Named Graph so that others can evaluate. In due course, it will become part of the DBepdia standard linksets. 
I think you can execute this right now :-) uOn 22 June 2011 11:47, Kingsley Idehen < > wrote: Does Wikipedia allow links to link to DBPedia? I have no idea how to do that; and if you read the discussion at [1], you;ll see that I don't represent OpenCorporates. I will forward your suggestion to OC. uAndy, (cc. OC) Thanks for the idea! If Open Corporates offered this linkset to DBpedia, it would be a nice candidate for featuring in the next release of the LOD Cloud Diagram ( I feel this would have good impact on their adoption, and could be easily achieved by creating a simple Silk Link Spec (maybe as easy as comparing company name and address) and running one of the Silk versions ( A link to a download file of this linkset could then be entered in CKAN ( Is OC interested in that? Any volunteers from this list for helping OC achieve that? Cheers, Pablo On Wed, Jun 22, 2011 at 12:55 PM, Andy Mabbett < >wrote:" "some invalid domain, range, subPropertyOf" "u1) Consider it specifies `rdfs:domain Mountain, Volcano`. The author of that mapping probably thought this means that the property `firstAscent` should apply to `Mountain` or `Volcano`. But by RDFS semantics, when you specify multiple classes as domain/range for a property, then every subject/object of a property is inferred to have all these classes. Eg in the above case, any subject will be inferred to be both `Mountain` and `Volcano`. Furthermore, the ontology generator doesn't emit two classes, but one invalid class URI: ``` dbo:firstAscent rdfs:domain ; ``` 2) Some subProperty statements have an object spelt in Uppercase. - in some cases this leads to a statement that does not connect to the intended property (in this case dbo:medalist), e.g.: ``` dbo:silverMedalist rdfs:subPropertyOf dbo:Medalist ``` - in other cases it leads to a statement which links to a class, which is a mistake ``` dbo:senator rdfs:subPropertyOf dbo:MemberOfParliament . ``` Out of 62 subProperty declarations, 20 have this problem: dbo:bronzeMedalist rdfs:subPropertyOf dbo:Medalist . dbo:codeLandRegistry rdfs:subPropertyOf dbo:Code . dbo:codeMemorial rdfs:subPropertyOf dbo:Code . dbo:distanceToCapital rdfs:subPropertyOf dbo:Distance . dbo:dutchMIPCode rdfs:subPropertyOf dbo:Code . dbo:goldMedalist rdfs:subPropertyOf dbo:Medalist . dbo:iso6391Code rdfs:subPropertyOf dbo:LanguageCode . dbo:iso6392Code rdfs:subPropertyOf dbo:LanguageCode . dbo:iso6393Code rdfs:subPropertyOf dbo:LanguageCode . dbo:musicalKey rdfs:subPropertyOf dbo:Type . dbo:officialSchoolColour rdfs:subPropertyOf dbo:ColourName . dbo:otherWins rdfs:subPropertyOf dbo:Wins . dbo:politicGovernmentDepartment rdfs:subPropertyOf dbo:Department . dbo:protectionStatus rdfs:subPropertyOf dbo:Status . dbo:rankingWins rdfs:subPropertyOf dbo:Wins . dbo:senator rdfs:subPropertyOf dbo:MemberOfParliament . dbo:silCode rdfs:subPropertyOf dbo:LanguageCode . dbo:silverMedalist rdfs:subPropertyOf dbo:Medalist . dbo:subTribus rdfs:subPropertyOf dbo:Tribus . dbo:superTribus rdfs:subPropertyOf dbo:Tribus . 3) I'm not sure whether it's a good idea to have classes and properties that have the same name, except capitalization. The mapping wiki uppercases properties, e.g. So the difference between these two is lost on people. There are 152 terms with duplicate names (see attachment). But I don't imagine it's feasible to change all these now uOn 12/5/14 11:19 AM, Vladimir Alexiev wrote: RDF Schema spec is clearly broken, in this regard. 
Most would expect the domain of a property to indicate the nature of relation (represented by RDF statements) subjects, and the range to do the same for relations objects. RDF Schema (as you've indicated, based on current spec [1]) asserts that given the following definition of the nature of a <#related> predicate (sentence/statement forming relation type): <#related> rdfs:domain foaf:Person, foaf:Document . the following entity description, represented by the <#related> relation below: <#this> <#related> <#that> . implies that: <#this> is an instance of disjoint classes foaf:Person and foaf:Document, rather than the fact that the <#related> relation have subjects that are foaf:Person or foaf:Document class instances . This is a nice example of why Schema.org introduced the schema:domainincludes [2] property. Links: [1] [2] ." "Links which are showing 404 Error" "uHi, Though I know this is not the right forum to inform such issues but I couldnot find any better place. The following links are showing 404 Error uyes, I too faced the same issue trying to download spotlight quick-start zip. Appreciate if someone can point to a working link to download the zip file. Thanks, Dileepa On Tue, Apr 23, 2013 at 4:32 PM, Arka Dutta < > wrote: uHi all, The access to the downloads server has been restored. Thanks, Pablo On Tue, Apr 23, 2013 at 7:48 AM, Dileepa Jayakody <" "Community coordination action: DBpedia reproducubility / Dockerization" "uHi Everyone, During the last DBpedia meeting, we decided to create a community coordinated action for making the DBpedia SPARQL endpoint reproducible. After a little brainstorming we came up with the following goals: - with each release create docker images - spread the docker images over servers from the community - keep a lit of all endpoints on the DBpedia website, first manually, then automatically updated Option 2: Community crowd-sourcing, i.e. uptime can be improved when we take off the heaviest users. For example, packaging DBpedia in docker and offering an easy way for configuration should help potential exploiters to do it on their own infrastructure, thus freeing resources to incidental (and less skilled) users. First steps are to create a list of public DBpedia endpoints and an official tutorials on setting up a DBpedia mirror. - scientific reproducibility -> for therefore Docker images As decided at the meeting, Jörn Hees will lead this action but we also identified some members that have done work in this field like Natanael Arndt, Markus Ackermann, Ritesh Kumar Singh and Kay Muller (in cc) *Next steps( - Everyone (as well as other community members that we didn't include) will present their work here - We will create a task force lead by Jörn and work on the above (or new) goals Looking forward to getting everyone's input here Cheers, Dimitris uHi, thanks to Dimitris for the introduction. As mentioned, i'm happy to coordinate the efforts to improve reproducibility of the online DBpedia endpoint. My background on this: I've been running a local Virtuoso Linked Data endpoint for our research group for quite some time now. Amongst others, we develop learning algorithms for Linked Data and they perform tons of mean SPARQL queries. Running these against online endpoints would disrupt their service and isn't really fair-use. So local mirror it was and as DBpedia is pretty interesting for us as glue for many datasets, i've always tried to keep up with the latest releases and host them locally. 
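One way to sanity-check such a local mirror against the public endpoint is a federated count, run against the local SPARQL endpoint (a sketch; dbo:Person is only an example class, and the SERVICE URL is the usual public endpoint at http://dbpedia.org/sparql):

PREFIX dbo: <http://dbpedia.org/ontology/>

# Run against the local mirror; the SERVICE block fetches the same figure from
# the public endpoint so the two counts can be compared side by side.
SELECT ?localCount ?publicCount
WHERE {
  { SELECT (COUNT(*) AS ?localCount) WHERE { ?s a dbo:Person } }
  SERVICE <http://dbpedia.org/sparql> {
    SELECT (COUNT(*) AS ?publicCount) WHERE { ?p a dbo:Person }
  }
}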
I started documenting this in form of line-by-line bash HowTos publicly, i think back with DBpedia 3.5 and updated the guide a couple of times. Over the time the process became easier, but it's still not what i'd call \"simple\". More than 200 monthly readers of each of my guides indicate that we should make this easier. In the latest revision of the guide, i also started dockerizing stuff, allowing me to quickly switch between different database snapshots for evaluations: Coming from this background, I fully agree with the goals that Dimitris mentioned. To kick things off, i already created a small overview document for the first \"phase\" reproducibility: You're invited to edit, discuss and get involved. Cheers, Jörn uThe availability of DbPedia as a virtuoso DB file was really helpful for me. I hope this feature will still be maintained on the long run. (And the fact that linkedgeodata was just one namedgraph away was also pretty cool!) Envoyé de mon iPhone uHi Olivier, Yes, that was an (inofficial) option that i offered in the past (simply a compressed backup of the whole Virtuoso DB directory directly after import). Unlike the docker approach it doesn't bundle a fixed Virtuoso version with the DB, so you'd need to install Virtuoso on your own. While the main focus here is make reproducibility very easy (so same data and executable as at release time), there are scenarios in which it's desirable to run an old DBpedia version with a newer Virtuoso version than at release time (think of SPARQL 1.1). So maybe we should also have a look at how to ship a DB version. As the DB snapshot is however very easy to extract from a docker image (probably a volume anyhow), I think this could be an easy addenum on top of (or as part) of the dockerization efforts. Best, Jörn uHi everybody, here is some summary of what we did so far: at SEMANTiCS 2015 we've presented dld: dockerizing linked data [1]. The idea of this was to provide a tool that basically creates docker-compose setups to easily create an infrastructure for serving RDF datasets via a SPARQL endpoint or other applications like OntoWiki. We've followed the principle of having one container per service, so we didn't want to put the complete stack including the data into a single container (Microservices [2], Single Responsibility Principle [3]). To achieve this we have identified three (or four) tasks in a setup: (1a/b): load and back-up data, in case of DBpedia, there would only be loading data (2): storage of the data (3): presentation, exploration and editing data This idea is meant to be very generic to cover different types of setups dealing with RDF data or data in general. For the case of DBpedia and also to achieve a \"scientific reproducibility\", as Dimitris mentioned, we might have to rethink this setup. Further I think also in the docker community best practices have evolved. We should also keep an eye on performance of services running in the containers vs. running them \"natively\" on a system, we have experienced some impact here already, but this might differ from setup to setup. [1] [2] [3] All the best, Natanael (Sorry for sending it twice, but I first had to subscribe)" "question with dbpedia Virtuoso endpoint" "uHi, I have been trying to run the following sparql query on me why? PREFIX dbo: SELECT DISTINCT ?x, ?z, ?y WHERE { ?z a dbo:Organisation . ?y a dbo:RadioStation . ?0 rdfs:label ?label0 . ?label0 bif:contains '\"Louisiana\"' . ?x dbo:state ?0 . ?y dbo:city ?x . ?y dbo:owner ?z . 
} Instead, if I don't use the bif:contains function, the query works, as shown below PREFIX dbo: PREFIX db: SELECT DISTINCT ?x, ?z, ?y WHERE { ?z a dbo:Organisation . ?y a dbo:RadioStation . ?x dbo:state db:Louisiana . ?y dbo:city ?x . ?y dbo:owner ?z . } However, it seems the bif:contains works for some other queries. Many thanks, Lushan Han Hi, I have been trying to run the following sparql query on Han u0€ *†H†÷  €0€1 0 + uDo you have too many quotes ' ' \" \" around Louisiana? (unfortunately can't test now as the endpoit seems to be down) Best Pablo On Oct 23, 2011 6:39 PM, \"Lushan Han\" < > wrote: uHi, Thank you for letting me know the more powerful sparql endpoint. The query works now. Thanks, Lushan On Sun, Oct 23, 2011 at 4:49 PM, Kingsley Idehen < >wrote: uHi Kingsley, It is nice to know that a powerful endpoint is provided for LOD cloud. And I would like to know what other datasets are included and the total size of data in the endpoint. Thanks Lushan On Sun, Oct 23, 2011 at 4:49 PM, Kingsley Idehen < >wrote: u0€ *†H†÷  €0€1 0 +" "Category Node?" "uI'm trying to use the WikiParser to determine the category list of a wikipedia page. The category tags are represented as TextNode objects but when I print out the toWikiText, it get an empty string. Should categories be \"TextNodes\" and if so, what's the correct extract the category name from the wikipage? input data: {{Colonial Colleges}} [[Category:New York| ]] [[Category:Former British colonies]] [[Category:States of the United States]] {{Link FA|es}} My code snippet: val testFile = new java.io.File(\"src/test/resources/datasource/wikipedia/xml/new_york.xml\") val parser = WikiParser() val xmlSource = XMLSource.fromFile(testFile) xmlSource.foreach{ wikiPage => val page = parser.apply(wikiPage) page.children.foreach{ node => node match { case template:TemplateNode => { println(\"template:\" + template.title + \" with \" + template.children.size + \" children\") } case section:SectionNode => println(\"section:\" + section.toWikiText) case text:TextNode => { println(\"text:\" + text.toWikiText + \" line: \" + text.line) } case link:LinkNode => { val label = link.children.map(_.toWikiText).mkString(\"\") } case x => println(\"class= \" + x.getClass) } } Output: text: line: 939 template:en:Template:Colonial Colleges with 0 children text: line: 940 text: line: 942 text: line: 943 text: line: 944 template:en:Template:Link FA with 1 children text: uHi, On Tue, Jul 19, 2011 at 03:19, Tommy Chheng < > wrote: You can use org.dbpedia.extraction.mappings.ArticleCategoriesExtractor [1] for this task. It extracts triples with dc:subject as predicate. The category tags are actually InternalLinkNodes. That might have been the problem in your provided code. Cheers, Max [1] ArticleCategoriesExtractor.scala" "JSONP callback REST" "uHi, I am using the I specify output of json with ?output=json. Is there a way I can specify a callback to a JavaScript method to handle responses using JSONP? What is the name of the callback parameter key? Is the Thank you. Brian" "indexing DBpedia with Swoogle; biological data in DBpedia" "uHi Joel, This would be great :-) The dataset is pretty well interlinked over the different category systems and over people who work in different areas and life in the same place. So, my guess would be, if you start with a bunch of URIs from different areas and dereference any URI you are getting in the results, you should get pretty good coveradge. 
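For a coverage experiment like the one above, a seed list of URIs from many different areas can be pulled straight from the endpoint rather than hand-picked (a sketch; vary OFFSET to sample further batches):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

# A batch of resource URIs to use as crawl seeds.
SELECT DISTINCT ?resource
WHERE { ?resource rdf:type owl:Thing }
LIMIT 1000
OFFSET 0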
Maybe take the ones from There are around 1.95 million things in DBpedia and it would really be interesting to know how many you get with this approach. It would also be interesting to know how many descriptions you get through following the RDF links into the external interlinked datasets. My guess would be alltogehter maybe 4-5 million. So, please try! This is really exiting! How many URIs does your crawler dereference per second? OpenLink is pretty optimistic about the stability of the server, but maybe you should not try too many at once. Kingsley: Do you have an idea how many crawling requests the server can take per second? Yes, the problem with this boxes is that they do not use infobox template, but that many of them are pure tables. Therefore, they are currently not indexed propertly by DBpedia. The DBpedia extraction framework allows different extractors to be pluged-in. Therefore, this could be solved by writing a special extractor for bio and chemistry data. There were some people on the dbpedia list a while ago who wanted to do this, but we did not get any results from them yet. Georgi: Do you know anything about their progress? This highly makes sense and would be really cool. I saw on the SPIRE page, that you are allready providing a SPARQL endpoint over the data. Do you also serve that data as Linked Data with dereferencable URIs? If not yet, you can just throw Pubby ( order to get the dereferencing into place. The next step would be to generate RDF links between your URIs and DBpedia URIs. As we do not have a clue about animal taxa, I guess it would be best, if you generate these links and send them to us, so that we can load them into the DBpedia RDF store. Some more information about how to generate links is found here: I love both ideas. More interlinked dataset are great and you should also announce them on the LOD mailing list and put them into the LOD dataset list once they are published as Linked Data ( Now that we have lots of relevant data sets interlinked on the Semantic Web, I think search engines a really important for showing people what can be done with the data. Therefore, I think it would be really cool if the LOD data gets loaded into Swoogle and I hope that other search engines like SWSE and Zitgist do the same shortly. Keep on the great work on Swoogle. Cheers Chris uChris Bizer wrote: Chris, You know that is an open ended question bearing in mind a myriad of factors when dealing with the internet :-) Should Swoogle simply consume the public RDF dump? Personally, I don't see the value in this type of crawling (*I believe Fred and others wrote about best practices in this realm a while back*). uHi all, can Google seems to have crawled about half a million DBpedia resources [1], so with that in mind server-load seems to not be a problem with Virtuoso :) Looking at the Google cache [2] that happened about a week ago at last. But I don't know if Google really crawled all 473k documents like the result-page [1] claims Cheers, Georgi [1] [2] ans_Hermann_von_Katte+site:dbpedia.org+dbpedia&hl;=de&ct;=clnk&cd;=91≷=de uGeorgi Kobilarov wrote: Georgi, Very interesting! Also see: We'll if Google could come in and get all of this then I guess Swoogle should too :-) That said, Giovanni's approach is preferred (personally). Kingsley uHi all, Well, experience and logic tell us that one shouldn't crawl X number of million of documents if a dump is available. We got 2.x million livejournal in about 3 months (really dependant on livejournal's connection quality). 
We started to crawl geonames (6.4 million documents) and we planned, at that time, that it would have taken months to crawl everything. Also, this caused issues on Marc's servers. This said, everybody should download and index the rdf dump if it is available (a matter of a couple of hours). Then, if the system send newly created and/or updated documents to pingthesemanticweb.com, then consumers of that data will be able to periodically synch with the system, via ptsw. In any case, I won't suggest to spend everybody's bandwidth and time by dereferencing all URIs to get data available in a single downloadable file. Take care, Fred uHi all, Well, this seems good, but could I suggest to check log first? Google results are pure statistics. The logs of the server would confirm. Take care, Fred ujoel sachs wrote: Joel, What's the SPARQL endpoint for swoogle? It isn't easily discernable from: Also, maybe now is a nice time to revisit synergy between swoogle and pingthesemanticweb? Kingsley uHi all, In that case you only have to contact me so that we can sort how this can be done. I already have some ideas about possible interactions between both systems, but we should dig further to see what is best for Swoogle, PTSW and the whole semweb community. As always, I am easily reachable via email, skype or IRC. Many people would benefits from such an interaction. I started PTSW more than one year ago to help people finding RDF documents to develop their projects and ideas; since then, sindice.com, doapstore.org and other stealth projects are using the PTSW web service. My goal is reached, and having Swoogle participating in PTSW would make things even better; it would be an even greater vector of development for the semweb community. Take care, Fred" "DBpedia Live and articles deleted from Wikipedia" "uHello DBpedia experts, It would be really helpful to hear your feedback on this: DBpedia Live appears to store records which have been deleted (over a year ago) from Wikipedia. Here's an example. This record currently appears in DBpedia Live (as of today June 9 2013): Nick u0€ *†H†÷  €0€1 0 + uThanks Kingsley for your comment. This usefully confirms the issue in 2 instances of DBpedia-Live. Can somebody explain it? Why does an article which was deleted from Wikipedia over a year ago appear in DBpedia Live here: And here in an alternative instance: Here's the deleted Wikipedia page: Is it intentional that deleted articles persist, or is it a bug? Many thanks in advance if anyone can explain this. Best wishes, Nick uHi Nick, On 06/10/2013 12:39 PM, Nick Andrews wrote: The current framework is able to handle the case of deleting a Wikipedia article, and that issue has been fixed a few months ago. Page [1] for instance, has been removed from Wikipedia, and it has been deleted from DBpedia Live as well [2]. The problem with that article is that it has been deleted in Jan 2012, and the current framework fetches only the pages which were revised not more than 3 months ago. We have fixed that issue as well, so the framework now regularly checks those pages that were revised a long time ago and reprocess them as well. We will try to deploy the revised framework as soon as possible. Sorry for any inconvenience. [1] [2] All_Shall_Fade" "Question about languages" "uHi, I have a question that probably has a very simple answer: You can enter different languages by setting the domain with a prefix like ru.dbpedia.org/etc. But how do I get it in english? 
en.dbpedia.org does not work and dbpedia.org gives me all the languages with no language definitions per abstracts etc. BR, Timo uHi Timo, On 12/23/2011 11:53 AM, Timo Wallenius wrote: You can use It's for English language only." "range of foaf:homepage in DBpedia - document please, not anyURI literals" "uHi folks I was just trying out the new extraction framework (nice work :), and looking over the ntriple/nquad dumps, when I noticed this: $ grep \" versus from the RDF/XML download page in the main site, grep homepage ~/Downloads/Neil_Gaiman.rdf In NTriples this is $ rapper -o ntriples ~/Downloads/Neil_Gaiman.rdf | grep homepage rapper: Parsing URI file:///Users/danbri/Downloads/Neil_Gaiman.rdf with parser rdfxml rapper: Serializing with serializer ntriples . The latter is the correct usage of foaf:homepage; it doesn't relate a person to a xmlschema-datatyped literal, but to a document. This is so that it can have other independent properties and relationships. I found this by chance. On closer inspection, it seems to be a difference between the data generated by latest version of extractors (I downloaded and ran it last night (until my disk filled up:)), and the current dbpedia: when I look at the latest downloadables they are ok too: grep Gaiman homepages_en.nq . I haven't looked at the handling of other FOAF properties, nor of your own vocab, so I have no idea if this change is part of a bigger situation. For foaf:homepage it would be great if you could revert to the document-valued treatement of the property. cheers, Dan ps. I'm trying to grap *all* the links from wikipedia to twitter; is the external_links dump going to cover that, or they only include urls from the explicit 'external links' section of the page? running \"grep -v External external_links_en.nq\" I find some lines that don't trace directly to an External Links section. When I check them they're sometimes subsections of External Links, but not always. This seems good news for me. pps. (and completely offtopic, but as context for why I'm digging into this stuff) where this gets really interesting is when we start cross-checking RDF assertions from different sites. Note that Twitter has the notion of a \"verified account\". So - Twitter assert via \"verified\":true, \"url\":\" which is reciprocated in the wikipedia/dbpedia data, although indirectly. Twitter are saying that the Person controlling the twitter online account 'neilhimself' also has a homepage of matching that description with various other characteristics. A lot of the apps I'm interested in don't need to do this kind of cross-check but it is nice to see that it could work, at least for well known people. This is btw the same logic Google implemented in their social graph API using XFN and FOAF -" "how mapping DBpedia" "uhellocan you help me in mapping English Dbpedia to other language?i want to mapping the English version instead of creating it from zero. what is you advice and how can i do that?best regardsEng.Aman A.Slamaateacher assistant in Nile Academy hello can you help me in mapping English Dbpedia to other language? i want to mapping the English version instead of creating it from zero. what is you advice and how can i do that? best regards Eng.Aman A.Slamaa teacher assistant in Nile Academy" "Inquiry about the path length of DBPedia datasets" "uDear Sir/Madam, I am a Ph.D student at computer science department of North Carolina State University, USA. Our research group are working on a project which may use DBPedia as the experiment dataset. 
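On the foaf:homepage range issue raised earlier in this digest, a quick check for offending triples (a sketch, runnable against any endpoint that has the homepages dataset loaded) would be:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# foaf:homepage should point at a document (a resource), never at a literal,
# so any rows returned here indicate data that still needs fixing.
SELECT ?s ?homepage
WHERE {
  ?s foaf:homepage ?homepage .
  FILTER ( isLiteral(?homepage) )
}
LIMIT 100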
But we got some questions about the dataset. 1.  In DBPedia, is there some pathes whose length is more than 3? 2. Or is there some tools that we can use to generate some graph paths whose length is more than 3? If so, we can use the dataset to demonstrate the superiority of our approaches. Thanks a lot." "Table Extractor" "uDear DBpedia’s community, I am here to announce you the results of my GSoC work: The Table Extractor. Extended REPORT and PROGRESS page [1] REPOSITORY with code, extraction’s data set and log examples, readme file [2] Little report: The aim of the project was to extract useful rdf data set from tables spread all over wiki pages. A lot of tables are used to store data of electoral, sports, competition results [3]. I started the extraction focusing on electoral results, firstly regarding USA presidential elections and therefore in general (it chapter). I had hard times finding the right solution due to a unfortunate solution I initially projected (involving JSONpedia). Please refer to progress page [1] to have more info about that. The final solution manages html page’s representations instead of json ones and it consists in a Python package, which works well on different table’s structure. It works on every wiki chapter, but data extraction strictly depend on rules created on a topic/language base. In fact tables are completely different in data and structures depending on the wiki chapter. Compare [4](it chapter) [5](en chapter) and [6](de chapter). Good results have been achieved on Electoral topic (it chapter). Hope to have the feedback from the whole community. Pills: USA elections results (it wiki pages): 1,8 % table’s lost due to structure’s problems. 87,3 % cells correctly extracted and correctly mapped. 1,9 % cells lost due to a lack of mapping rules. General elections results (it wiki pages): 8 % table’s lost due to structure’s problems. 43 % cells extracted and correctly mapped. 42,6 % cells lost due to a lack of mapping rules. Student: Simone Papalini Mentors: Marco Fossati (DBpedia), Claudia Diamantini, Domenico Potena, Emanuele Storti [1] [2] [3] [4] [5] [6] Pr%C3%A4sidentschaftswahl_in_den_Vereinigten_Staaten_2000" "A quick analysis of the classes in the DBpedia ontology" "uI did a quick analysis of the classes in the DBpedia ontology and found quite a few issues that I think need attention. - Many classes have no instances. Each of these empty classes should be examined to see whether they should be removed or modified. - The sports-related groupings are differentially populated, differentially organized, and unaxiomatized. These groupings should be regularized and minimal axiomatizations provided for them. For example, there would be classes for Basketball under at least SportsLeague, SportsTeam, Coach, and SportsEvent each defined as the restriction of the grouping elements related to Basketball. The sports groupings include Sport (which is special), SportsLeague, SportsTeam, Athlete, Coach, SportsTeamMember, SportsManager, SportsEvent, SportFacility, SportCompetitionResult, SportsSeason, and Tournament. - Numerous stated inclusion relationships are not correct when considering the normal definition of the class names. Each of these should be examined and either descriptions of the classes that support the inclusion relationship be provided or the relationship itself modified. For example, instances of the RecordOffice class do not appear to be non-profit organizations. 
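On the path-length question above: chains longer than three hops can be probed directly with a query along these lines (a sketch, kept generic and tightly limited because unconstrained path queries are expensive):

# Four-hop chains between resources, i.e. paths of length greater than three.
SELECT ?a ?b ?c ?d ?e
WHERE {
  ?a ?p1 ?b .
  ?b ?p2 ?c .
  ?c ?p3 ?d .
  ?d ?p4 ?e .
  FILTER ( isIRI(?b) && isIRI(?c) && isIRI(?d) )
}
LIMIT 10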
Some other examples of questionable or outright incorrect subclasses here are TermOfOffice, BackScene, ChessPlayer, PokerPlayer, TeamMember, Saint, FictionalCharacter, MythologicalFigure, OrganisationMember, Religious, Baronet, Medician, Professor, Embryology, Lymph, Constellation, Galaxy, ElectionDiagram, Olympics, OlympicEvent, ControlledDesignationOfOriginWine, PublicServiceInput, PublicServiceOutput, and ProgrammingLanguage. - Some class relationships are missing. For example, TeamMember is unrelated to SportsTeamMember even though they are both supposed to be members of athletic teams. Some other examples of missing relationships are between BullFighter and Bullfighter, between Host and TelevisionHost, and between Comic and Comics. The missing relationships should be provided or the classes merged. - Place is a rather unnatural union. It should either be removed or better organized. - There are quite a few subclasses of Building that are not truely buildings, including AmusementParkAttraction, Casino, Factory, Hotel, MilitaryStructure, Abbey and the other religious places of worship, Restaurant, ShoppingMall, and Venue. Similarly, there are a number of subclasses of ArchitecturalStructure that may not be architectural structures, including Garden, PublicTransitSystem, and Park. There are a few subclasses of NaturalPlace that are not necessarily natural places, including Canal, and even Lake. These classes should be moved up in the ontology. - The subclasses of Species are not collections of species. The subclasses should either be modified or moved elsewhere in the ontology. - The normal definition of PopulatedPlace is much too narrow to encompass all its subclasses. A new general class should be created to encompass the subclasses and PopulatedPlace be modified as necessary. - There are a number of strange top-level or second-level classes. These classes should be examined to ensure that they make sense. Many of these classes appear to be somehow related to measurements, including Altitude, Area, Blazon, ChartsPlacement, Demographics, Depth, GrossDomesticProduct, GrossDomesticProductPerCapita, HumanDevelopmentIndex, Population, Sales, Statistics, and Tax. Other strange classes include LifeCycleEvent, Imdb, Listen, PenaltyShootOut, PersonFunction, PoliticalFunction, Profession, TopicalConcept, Type, and YearInSpaceflight. Even if I had editing rights to the ontology I think that the fixes I have outlined above go beyond what should be done without some discussion. Comments? peter uHi Peter, Thank you for your detailed report. The DBpedia ontology is (a) crowdsourced and (b) follows a data-driven approach. Classes and properties are mainly derived from the actual data coming from different Wikipedia chapters. Those are the main reasons of the issues you mentioned. It would be great if you could contribute a deep analysis and detect the inconsistencies. In this way, we could clean the ontology up and provide rock solid semantics. As you already mention lots of examples, a brand new ontology and exact deltas with the current one would be highly beneficial. Cheers! On 4/10/14, 12:17 AM, Patel-Schneider, Peter wrote: uHi Peter and Marco, Except for a few minor points, I agree completely with the findings of Peter and Marco's comments. So, if the majority of us agree, let us start to do something about it. Since such an endeavour would be a good starting point for doing things better, how about trying to make the DBpedia ontology a little bit closer to existing classifications? 
For instance, those that exist in the field of describing scientific disciplines and areas of knowledge, like the UDC Linked Data Summary? ( Regards, Gerard Van: Marco Fossati [ ] Verzonden: donderdag 10 april 2014 15:14 Aan: Onderwerp: Re: [Dbpedia-discussion] A quick analysis of the classes in the DBpedia ontology Hi Peter, Thank you for your detailed report. The DBpedia ontology is (a) crowdsourced and (b) follows a data-driven approach. Classes and properties are mainly derived from the actual data coming from different Wikipedia chapters. Those are the main reasons of the issues you mentioned. It would be great if you could contribute a deep analysis and detect the inconsistencies. In this way, we could clean the ontology up and provide rock solid semantics. As you already mention lots of examples, a brand new ontology and exact deltas with the current one would be highly beneficial. Cheers! On 4/10/14, 12:17 AM, Patel-Schneider, Peter wrote: uOn Apr 10, 2014, at 6:14 AM, Marco Fossati < > wrote: I don't think that this last is true. For example, there are about 250 classes in the ontology that do not have any instances, at least in the data that I have examined, and more that have no instances that are not instances of any of their subclasses. There are also a number of places where the ontology organization does not match the information in Wikipedia. The DBpedia ontology looks much more like a unreviewed crowdsourced artifact than an artifact that matches either the information in Wikipedia or in DBpedia. Well, there are no formal inconsistencies in the DBpedia ontology, as it is too inexpressive to have inconsistencies. All that can be done is pointing out where the ontology does not appear to match either the normal definitions of the categories or the definitions found by examining Wikipedia information and differences between different parts of the ontology. I do have a list of all the empty classes, but this information is available elsewhere. The analysis that I sent out yesterday lists quite a number of deficiencies in the ontology. This analysis should be good enough to serve as a start on fixing the ontology. I agree that this would be a good idea. However, this should be an iterative process, and should not depend on a complete analysis. Well, producing a new ontology is work, and it would be nice that this work has an effect. This is why I was wondering who else was interested in improving the ontology. peter uOn 4/10/14 4:03 PM, Patel-Schneider, Peter wrote: I would hope the answer is: everyone that uses DBpedia . This is a classic crowd-sourcing affair. It is going to require contributions from a myriad of participant profiles. The process will be iterative with no final completion date, since that's an contradictory goal :-) uHi, I think that Peter has good reasons to complain on the current status of the DBpedia ontology :) But besides debatable choices on names, I’d concentrate on the main issue, which is the data-grounding of the ontology. The major example is that there is no systematic checking of the relation between domain/ranges and the way properties are actually used (although the situation has improved in 3.9). An organization of classes should be based firstly on how properties are used, after all we are talking primarily linked data, not taxonomies. 
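A sketch of the kind of data-grounded check meant here, using dbo:firstAscent from the earlier thread as the example property (substitute any property of interest):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

# Compare the declared rdfs:domain of a property with the classes its subjects
# actually carry in the instance data.
SELECT ?declaredDomain ?actualClass (COUNT(*) AS ?uses)
WHERE {
  OPTIONAL { dbo:firstAscent rdfs:domain ?declaredDomain . }
  ?s dbo:firstAscent ?o .
  ?s a ?actualClass .
}
GROUP BY ?declaredDomain ?actualClass
ORDER BY DESC(?uses)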
Some research has been done on this, and I will contribute some literature to the community if any activity starts on reconciling crowd-sourcing, automatic extraction of data, and good practices of ontology design. Best Aldo On Apr 10, 2014, at 10:03:05 PM , Patel-Schneider, Peter < > wrote: uHello Peter, thank you very much for your inputs. The state of the DBpedia ontology is certainly an issue. You can register at [1], ask for editing rights, and go on and make your changes. I'd also feel not quite well performing major changes or removing classes without some discussion, since it is the effort of others and it is not always clear, if somebody actually uses it. Maybe, we could organize an ontology enhancement and guidelines workshop at the next DBpedia Community Meeting in Leipzig [2]. [1] [2] On Apr 10, 2014, at 10:03:05 PM , Patel-Schneider, Peter < > wrote: SELECT DISTINCT ?type WHERE {?type a owl:Class. FILTER NOT EXISTS {?subject a/rdfs:subClassOf* ?type.} } As for the English DBpedia dataset there are 142 unused classes: As Marco said, you'd also need to consider other Language Chapters that use the same ontology. But obviously there are some classes needless, redundant, badly described, or just wrong. Best regards Magnus uOn 4/11/14, 11:47 AM, Magnus Knuth wrote: +1 uOn 4/11/14 5:47 AM, Magnus Knuth wrote: [1] Triples relating to fixes and new relations are first created in an RDF document that 's WWW accessible [2] The documents are announced here as an \"invite to examine\" request etc[3] If acceptable, the Triples are added to the project [4] If unacceptable, for any reason, then in the very worst case (re. deadlocks) the Triples end up where they are on in a specific named graph in the Virtuoso instance . The real beauty of AWWW, as exemplified by Linked Data, is the ability to \"agree to disagree\" without creating inertia. Virtuoso can handle many Linked Data scenarios, and \"agreeing to disagree\" lies at the core of its design (like AWWW). Peter: we would all gladly welcome your input and contributions. The steps above will make this smooth and ultimately enlightening :-) uThis proposal illustrates one of the major problems with the DBpedia ontology - triples check in but they never check out. Many of the problems with the DBpedia ontology require changes. Other problems have to do with the expressivity of the ontology. I don't think that changes here can be effected just by adding and removing bits of the ontology. Other problems have to do with the philosophy of the ontology. peter On Apr 11, 2014, at 5:27 AM, Kingsley Idehen < > wrote: uOn 4/11/14 2:12 PM, Patel-Schneider, Peter wrote: Sorta, because of the misconception that SPARQL is steal Read-Only. I spend a good chunk of my day writing SPARQL 1.1 using INSERT, DELETE, and UPDATE (via INSERT and DELETE combos) to massage data, across many data spaces. Change is good. And it will work. Remember, DBpedia deploys Linked Data using a Quad Store, so the re-write rules and SPARQL queries used to perform the name->address indirection are extremely flexible. For instance, you can actually set the named graph URI scope for these SPARQL queries explicitly or via our NOT FROM NAMED GRAPH extension re., negation etc You can have a list of Named Graphs or excluded Named Graphs when generating the description of a DBpedia Entity URI's referent. Again, its just triples to which SPARQL 1.1 patterns can be applied. In short, this is the way to truly appreciate the power of SPARQL and Linked Data. 
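A sketch of how an alternative ontology can sit alongside the data in its own named graph; the graph IRI below is a hypothetical local name for a revised ontology, while the instance data stays untouched:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

# Read the class hierarchy from an alternative ontology graph, but match
# instances against the unchanged DBpedia data.
SELECT ?athlete
WHERE {
  GRAPH <http://example.org/graphs/revised-dbpedia-ontology> {
    ?subClass rdfs:subClassOf dbo:Athlete .
  }
  ?athlete a ?subClass .
}
LIMIT 100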
You can variants of the Ontology, an alternative Ontology, it doesn't matter, the Linked Data deployment will be unaffected. The philosophy of the Ontology cannot change, that's a world view of the ontology creators. That doesn't stop another ontology existing as an alternative set of \"context lenses\" into the same data. I encourage you to make your changes, or make a new ontology, whichever path you take, the end product will be useful and a showcase for perspectives sometimes overlooked due to blurred and blurry perspectives :-) We can do this, its the next stage in the natural evolution of DBpedia and the broader Linked Open Data Cloud. Note: there is zero speculation in my response. I've already done (and continue to do) a lot of this (hands on fashion) over the years, following LOD cloud initial bootstrap. BTW uHello, Regarding class usage, the Dutch chapter does a great job already and Magnus' query returns no results;) [1] This means that all classes are needed at the moment. Regarding the changes, I am also in favor to move forward and fix all inconsistencies. Let's start already with the obvious ones and discuss any major changes in the DBpedia meeting. Changing PopulatedPlace for instance will break many applications but if this is the way to go I am also in Cheers, Dimitris [1] .org&query;=SELECT+DISTINCT+%3Ftype+WHERE+%7B%3Ftype+a+owl%3AClass.+FILTER+NOT+EXISTS+%7B%3Fsubject+a%2Frdfs%3AsubClassOf*+%3Ftype.%7D+%7D%0D%0A&format;=text%2Fhtml&timeout;=0&debug;=on On Sat, Apr 12, 2014 at 3:58 AM, Kingsley Idehen < >wrote: uOn Apr 11, 2014, at 11:58 AM, Kingsley Idehen < > wrote: How can I use SPARQL 1.1 to change the DBpedia ontology? I'm not sure how any of this can be used to effect changes in the DBpedia ontology. Again, I'm not sure what play SPARQL 1.1 has with respect to the expressive power of the ontology. Well, sure, none of this will affect most Linked Data uses, but that's not what I'm interested in. I'm interested in using the DBpedia ontology to organize information. I could, of course, simply use a different ontology, but my hope here is that use of the DBpedia ontology in products will result in a better ontology, and that that can be shared. It appears that the current ontology does not match the stated philosophy of the ontology. One or the other should change, and probably both. I would love to make changes. There actually is a modified version of the ontology that is in use. peter uOn 04/13/2014 06:32 AM, Dimitris Kontokostas wrote: Here are two places in the ontology where there are classes that should not be subclasses of their superclasses: CelestialBody: From Typically, an astronomical (celestial) body refers to a single, cohesive structure that is bound together by gravity (and sometimes by electromagnetism). Examples include the asteroids, moons, planets and the stars. Astronomical objects are gravitationally bound structures that are associated with a position in space, but may consist of multiple independent astronomical bodies or objects. These objects range from single planets to star clusters, nebulae or entire galaxies. From this, a galaxy is not a celestial body, but it is a celestial object. A constellation is neither a celestial body nor a celestial object. Unfortunately, this is not a very the use of this phrase appears to vary somewhat, and the Wikipedia page does not point to a source for the definition. Some usages appear to allow constellations to be celestial bodies. 
My suggestion is that CelestialBody be renamed CelestialObject and that constellation be moved outside of it. Place and its subclasses: From Settlement, locality or populated place are general terms used in statistics, archaeology, geography, landscape history and other subjects for a permanent or temporary community in which people live or have lived, without being specific as to size, population or importance. A settlement can therefore range in size from a small number of dwellings grouped together to the largest of cities with surrounding urbanized areas. The term may include hamlets, villages, towns and cities. The term is used internationally in the field of geospatial modeling, and in that context is defined as \"a city, town, village, or other agglomeration of buildings where people live and work\".[1] From this, just about all subclasses of PopulatedPlace are not populated places. The only exception is settlement. From A building is a man-made structure with a roof and walls standing more or less permanently in one place. Buildings come in a variety of shapes, sizes and functions, and have been adapted throughout history for a wide number of factors, from building materials available, to weather conditions, to land prices, ground conditions, specific uses and aesthetic reasons. To better understand the term building compare the list of nonbuilding structures. Most subclasses of Building are not either not buildings, e.g., WindMotor and Treadmill, or are not always buildings, e.g., ShoppingMall. Many subclasses of Place also conflate function with structure, e.g, Church. The physical presence of many churches are not buildings, and some churches do not even have a physical structure associated with them at all. My suggestion is that this entire portion of the ontology needs to be revamped. Suggested changes include moving most of the subclasess out from under PopulatedPlace and making PopulatedPlace equivalent to Settlement, moving many of the subclasses out from under Building, and changing the names of many of the subclasses to better reflect their intended meaning. uOn 4/13/14 8:32 PM, Patel-Schneider, Peter wrote: You can use SPARQL 1.1 (from your SPARQL 1.1 compliant application) to generate triples is a named graph local to your application, based solutions returned to you from the public SPARQL endpoint. You can use LOAD, INSERT etcto produce your local RDF statements expressing whatever you have in mind. Once done, you can publish an RDF document for incorporation back into the DBpedia project etc Until you make a local copy, as I described above, you will believe the statement to be true. All you are doing (ultimately) is make changes in a document and then sending them over for incorporation. In the very worst case, you RDF document content will still be part of the LOD Cloud as long as you publish it on the Web. The first step is making an RDF document with the alternative view that you seek. SPARQL 1.1 let's you make new RDF statements from existing RDF statements. The DBpedia ontology is a collection of RDF statements. Yes, and information lives in documents, right? Thus, you simply grab the relevant data from DBpedia, massage it (using SPARQL 1.1 or other means) and then you have a new document (comprised of new or revised RDF statements). Yes, why is contributing tweaks to the DBpedia ontology using an RDF document produced by you not an option here? Do that, and everything else falls into place. Can't you reflect that in an RDF document submitted to the project? 
Is this ontology represented in an RDF document that's accessible via an HTTP URL, at this point in time? If it exists, then we are nearly there. Kingsley uOn 4/14/14 7:02 AM, Kingsley Idehen wrote: Important typo fix. I meant to say: You can use SPARQL 1.1 (from your SPARQL 1.1 compliant application) to generate triples *in* a named graph local to your application, based *on* solutions returned to *your app* from the public SPARQL endpoint. I do this all the time when editing definitions (for classes and properties) oriented triples in ontologies. Typical example, where I am adding missing rdfs:isDefinedBy, voca:defines, and wdrs:isdescribedby relations to an existing shared ontology. Note: I am using Virtuoso as my SPARQL 1.1 compliant application (so I have the ability to grab data from any SPARQL endpoint and then use SPARQL 1.1 for local named graph scoped INSERT and DELETE operations): The Identity of Resources on the Web ontology (IRW). # Ontology URI: # Ontology Document URL: I use SPARQL 1.1 LOAD to get the data into Virtuoso LOAD ; WITH GRAPH INSERT { ?s . ?s. a owl:Ontology . ?s . foaf:primaryTopic ?s . } WHERE { {?s ?o} UNION {?s ?o} UNION {?s ?o} UNION {?s ?o} UNION {?s a ?o} UNION { ?s ?o} UNION {?s ?o} } COSMO Ontology Ontology URI: Ontology Document URL: Using SPARQL 1.1 LOAD to get data into Virtuoso LOAD ; WITH GRAPH INSERT { ?s . ?s. a owl:Ontology . ?s . foaf:primaryTopic ?s . } WHERE { {?s ?o} UNION {?s ?o} UNION {?s ?o} UNION {?s ?o} UNION {?s a ?o} UNION { ?s ?o} UNION {?s ?o} } uAaah, sure I can use SPARQL 1.1 to massage triple stores, including triple stores that use IRIs from the DBpedia ontology. In this way, I could modify the results, perhaps to make them look like certain stuff had been removed from the DBpedia ontology, although this process can result in systematic errors if the triples include inferred triples. My question was, however, what SPARQL 1.1 has to do with changing the DBpedia ontology itself. peter On Apr 14, 2014, at 5:42 AM, Kingsley Idehen < > wrote: uOn 4/14/14 12:07 PM, Patel-Schneider, Peter wrote: Answer: It enables you import the data in question, conditionally (via SPARQL query pattern solution), en route to crafting a new ontology or tweaking the existing ontology. You are ultimately going to be doing at least one of the following (locally): 1. adding new RDF statements 2. deleting existing RDF statements 3. updating existing RDF statements (via conditional INSERT and DELETE). Of course, you can forget SPARQL and just import the lot and editing by hand etc Fundamentally, you can fix DBpedia's ontology by contributing your fixes in the form of RDF statements for consideration by the maintainers. If rejected (for whatever reasons) you can still publish your RDF statements via an RDF document to some location under your control on the Web. Your revised ontology is your set of \"context lenses\" into the DBepdia dataset. If you don't want to craft the new ontology or fixes to the existing ontology, using the methods suggested, how else can you expect this to happen? Kingsley uKingsley's approach is one way to go but I think we should focus on fixing the ontology in the source, which is the mappings wiki. The German chapter is working on ways to automatically import axioms in the mappings wiki, so of course this is an option too. I think you already pointed out most of your suggested changes so we can start working on these. 
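One possible shape for such a machine-readable change-set is SPARQL 1.1 Update against a local copy of the ontology; the graph IRI below is hypothetical, and dbo:CelestialObject is the class name proposed earlier in this thread rather than an existing term:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

# Example change: detach dbo:Galaxy from dbo:CelestialBody and re-attach it to
# the proposed dbo:CelestialObject, as argued above.
WITH <http://example.org/graphs/dbpedia-ontology-local>
DELETE { dbo:Galaxy rdfs:subClassOf dbo:CelestialBody . }
INSERT { dbo:Galaxy rdfs:subClassOf dbo:CelestialObject . }
WHERE  { dbo:Galaxy rdfs:subClassOf dbo:CelestialBody . }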
[1] Cheers, Dimitris [1] On Mon, Apr 14, 2014 at 9:46 PM, Kingsley Idehen < >wrote: uWe can more or less do this since we are already inserting labels into the ontology in batch mode [1]. It would be quite helpful for us if someone interested in editing the ontology programmatically, would produce a list of changes (class name and changed properties, in any kind of machine readable format you want, even Sparql would be nice). We could then better experiment and introduce the changes directly into the mappings wiki. Changing the ontology in the triplestore would create a synchronization problem, by the next extraction those changes would need to be recomputed and reintroduced. Cheers, Alexandru [1] On Tue, Apr 15, 2014 at 8:06 AM, Dimitris Kontokostas < >wrote: uOn 4/15/14 1:06 AM, Dimitris Kontokostas wrote: We should use this episode to make a clearer guidelines for evolving DBpedia. Options: 1. In the source" "literatur extraction" "uDear dbpedia team, I have a question concerning the extraction of literatur information from wikipedia pages. How can I get an integrated view of the information concerning the data inside the literature template. For example - the article includes a {{Literatur | Autor=Martin Ester, Jörg Sander | Titel=Knowledge Discovery in Databases. Techniken und Anwendungen | Verlag=[[Springer Science+Business Media|Springer]] | Ort=Berlin | Jahr=2000 | ISBN=3-540-67328-8}} After your extraction and mapping processes each data element is separated from each other. As far a I can see, there is no chance to reconstruct the literatur information each other (just literals). thanks a lot in advance, Robert uOn Fri, Jan 4, 2013 at 6:54 AM, Robert Glaß < > wrote: I don't know the answer, but there are more problems than just the fact that the property values aren't grouped together. The ISBNs are being corrupted because they're extracted as integers instead of strings and the author property is all author names together rather than individual values for each author. Tom uDear Robert, Tom, The default behaviour for template extraction is to attach the template properties to the main resource while what you are looking for is actually a \"blank node\" with the template properties attached to the main resource. It is not currently supported by the framework but can be implemented easily with a custom extractor. This could be a nice to have feature but is not in our immediate priorities. However, we could help you / give you directions if you want to implement it. (the tricky part would be to process the data to locate duplicates) @Tom As for the integer problem, this is a common behaviour for the infobox extractor where type is inferred with heuristics. Custom extractors and the mappings extractor do not have this problem. Best regards, Dimitris Kontokostas On Fri, Jan 4, 2013 at 4:28 PM, Tom Morris < > wrote: uHi, uOn Thu, Jan 10, 2013 at 5:40 PM, Julien Cojan < > wrote:" "Lexicalizations Dataset" "uHello, I'm looking for the DBpedia Lexicalizations dataset, but all the urls I find redirect to which throws a 404. Could someone share the link with me? Hello, I'm looking for the DBpedia Lexicalizations dataset, but all the urls I find redirect to me? uHi, I did. Although I can recreate de lexicalizations dataset using the data available there, I would still prefer having the original data, as I'm trying to replicate a paper that used it. Maybe it is available in the repository you sent me, but I couldn't find it (found only the raw data described in that wiki). 
Thanks, Daniel On Mar 29, 2016 09:31, \"Ghislain Atemezing\" < > wrote: uHi Daniel, Did you look at this wiki Or even this one? Ghislain uHi Daniel, Where did you get this broken link? Is it by any chance the nlp2014 folder here: On Mon, Mar 28, 2016 at 1:59 PM, Daniel Ferreira < > wrote: uHi, I got the broken links from I'm mostly interested in getting the same pmi's as described in the oldwiki. I couldn't find what I wanted in the link you gave me, although the folder name sounded promising. Thanks, Daniel On Tue, Mar 29, 2016 at 3:10 PM, Dimitris Kontokostas < > wrote: uwe recovered these links as well On Tue, Mar 29, 2016 at 5:26 PM, Daniel Ferreira < > wrote:" "Become an editor" "uHi, How can I get editor rights in mappings.dbpedia.org for the user name gerb? Cheers, Daniel uHi Daniel, you got editor rights. You can also create your user wiki page now. Happy mapping! Anja On May 17, 2011, at 10:55 AM, Gerber Daniel wrote:" "what is loaded into dbpedia?" "uDoes anyone know which of the dbpedia export files have been loaded into the dbpedia endpoint? thanks, tim Does anyone know which of the dbpedia export files have been loaded into the dbpedia endpoint? thanks, tim uHi Tim, you can find a list of all loaded files here: Cheers, Anja On Oct 28, 2011, at 7:08 AM, Tim Harsch wrote:" "Getting amount of links between DBPedia instances (page_links_en.nt)" "uHi, is it possible (in a appropriate way, i'm not familiar with wikipedia extraction (framework)) or what is the easiest way for getting the amount of each link which is linking from one to another DBPedia instance. In the Wikipedia Pagelinks dump you have only the information, that one DBPedia instance is linking to another DBPedia instance, but the cardinality is missing for me, i like to know how many times one DBPedia instance is linking to another DBPedia instance. Do you know a easy way for this? best regards and thank you for replies! Gafur uHi Gafur, On 12/03/2012 09:35 PM, wrote: The English pagelinks dump [1] is not loaded into the official DBpedia endpoint [2], as it's quite large. So, I would suggest that you establish your endpoint, i.e. download the DBpedia dumps and load them into a Virtuoso [3] instance installed on one of your own machines, and perform your queries against it instead. And then you can use the following SPARQL query to get the required information: select ?s, count(?o) as ?numOfLinks where {?s dbpedia-owl:wikiPageWikiLink ?o. } group by ?s limit 10 But please take care that the former query will talk a lot of time, so it's way better if you use it in conjunction with a specific resource. So, if you are looking for the number of links of resource \"Paris\" for example, you should use the following query instead: select count(?o) as ?numOfLinks where {dbpedia:Paris dbpedia-owl:wikiPageWikiLink ?o. } Hope that helps. [1] [2] [3] uHello Gafur, Am 04.12.2012 05:35, schrieb : There are several metrics that you could want: 1. Each DBpedia instance is only linked maximally once to each other instance via pagelinks as there can not be duplicate triples. 2. You can count in and out degree, i.e. number of pagelinks per instance as subject or object respectively. Unix is you friend here ( wget bzcat page_links_en.nt.bz2 | cut -f1 -d '>' | sed 's/ //' | awk '{count[$1]++}END{for(j in count) print \"<\" j \">\" \"\t\"count [j]}' > outdegree_subjects.tsv" "Linking To Items From Wikipedia Lists" "uHi All, Is there currently a strategy in place for explicitly linking entities referred to in a Wikipedia list? 
For example, I just went looking for a quick way to visualize one image representation of each element, but the data came up short. This list exists, but only has metadata about the list itself: and when navigating to a specific element: there is only a type \"Thing\" specified. In the absence of an explicit entity type, having a list that actually enumerated its items would be extremely useful, and of course there are a lot of lists on Wikipedia. Am I missing a structure here, or does it not exist yet? Cheers, - Sands Fish - Data Scientist / Software Engineer - MIT Libraries - - E25-131 body{font-family:Helvetica,Arial;font-size:13px} Hi All, Is there currently a strategy in place for explicitly linking entities referred to in a Wikipedia list? For example, I just went looking for a quick way to visualize one image representation of each element, but the data came up short. This list exists, but only has metadata about the list itself: E25-131 uHi Sands, as far as I know there is no strategy for extracting entity types for elements in Wikipedia lists articles. In my opinion it would be interesting to develop such an idea. Main tasks would be: - correctly identify lists articles (only based on article title?) - extract elements from lists articles (all the wikilinks in the list articles?) - mapping of list context to a DBpedia type. That is possibly the hardest part. Should NLP be used? Maybe this could become a GSoC project. Do you have any approach to propose? BTW, looks like the Hydrogen article is using Infobox_hydrogen which is not in the DBpedia mappings, hence no type is extracted for that entity. Cheers Andrea Il 28/gen/2014 20:27 \"Sands Alden Fish\" < > ha scritto: uHello Sands, Wikipedia lists are tricky. the following paper gives an overview of the problems Extending DBpedia with Wikipedia List Pages (Position Paper) ( from Heiko Paulheim and Simone Paolo Ponzetto This is from the proceedings of the NLP & DBpedia 2013 workshop IIRC Wikidata is supposed to provide a list mechanism to Wikipedia, which will then be easier to extract For Dutch, some List pages use custom templates for rendering and we exploit the mappings wiki to generate them e.g. Cheers, Dimitris On Wed, Jan 29, 2014 at 9:48 AM, Andrea Di Menna < > wrote:" "request for Russian namespace (ru)" "uHello, DBPedia Maintainers, Thank You for fast action (I have received editor's right at mappings.dbpedia.org). The second thing which I have to ask You is create the Russian language namespace (ru) in the framework and on the wiki (information about it is placed in section \"Main Page\" -> \"4. Mappings for new languages\"). Then I can start to create new mappings in Russian namespace and continue to work with Greece team together. Sincerely yours, Dmitry Belyakov. uNo problem. There is now a Russian namespace. It's great that you are willing to contribute! Best, Max On Thu, Mar 10, 2011 at 09:17, dmitry < > wrote:" "Tools, API or Web service to get DBpedia URI of entity from keywords" "uHi, Which tools, API or Web service is able to get a BDpedia URI of an entity of others online free dataset with keywords. For example getting like USA or US. I have found DBpedia lookup.Is there another tools for others online databases? Thanks. uHey, isn't this what DBPedia Spotlight does? Martynas graphity.org On Thu, Apr 25, 2013 at 5:37 PM, Olivier Austina < > wrote: uHow mature is Spotlight? 
If I simply type \"Africa\" into the demo [ If I type in \"We are in Africa\" under the same conditions, it annotates \"in Africa\" with the resource URL for Africa. Hm. - Sands Fish - MIT Libraries - / @sandsfish / www.sandsfish.com From: Martynas Jusevièius [ ] Sent: Thursday, April 25, 2013 11:53 AM To: Olivier Austina Cc: DBpedia Subject: Re: [Dbpedia-discussion] Tools, API or Web service to get DBpedia URI of entity from keywords Hey, isn't this what DBPedia Spotlight does? Martynas graphity.org On Thu, Apr 25, 2013 at 5:37 PM, Olivier Austina < > wrote:" "Important Change to HTTP semantics re. hashless URIs" "uAll, Here is a key HTTP enhancement from Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content note from IETF [1]. \" 4. If the response has a Content-Location header field and its field-value is a reference to a URI different from the effective request URI, then the sender asserts that the payload is a representation of the resource identified by the Content-Location field-value. However, such an assertion cannot be trusted unless it can be verified by other means (not defined by HTTP). \" Implications: This means that when hashless (aka. slash) HTTP URIs are used to denote entities, a client can use value from the Content-Location response header to distinguish a URI that denote an Entity Description Document (Descriptor) distinct from the URI of the Entity Described by said document. Thus, if a client de-references the URI and it gets a 200 OK from the server combined with in the Content-Location response header, the client (user agent) can infer the following: 1. denotes the real-world entity 'Barack Obama' . 2. denotes the Web Document that describes real-world entity 'Barack Obama' uOn 3/25/13 12:12 AM, Pat Hayes wrote: uOn 3/25/13 4:42 AM, Mo McRoberts wrote: Yes-ish. I say so that because you are correct with regards to the excerpt above. That said, here's an excerpt (further down in the note) that does hone into the matter that's challenged hashless HTTP URI based entity denotation for some time now re., Linked Data: \"If Content-Location is included in a 2xx (Successful) response message and its field-value refers to a URI that differs from the effective request URI, then the origin server claims that the URI is an identifier for a different resource corresponding to the enclosed representation. Such a claim can only be trusted if both identifiers share the same resource owner, which cannot be programmatically determined via HTTP.\" Basically, the role of the Content-Location header with regards to Linked Data URI disambiguation heuristics is now much cleaner. Anything to do with Trust ultimately lays the foundation for WebID and WebID+TLS value proposition comprehension and appreciation :-)" "Ontology2 Releases RDF Dump of Nearly 1, 000, 000 Free Images" "uOntology2 announces the beta release of the Ookaboo RDF Dump, which contains metadata for nearly 1,000,000 public domain and Creative Commons images of more than 500,000 specific topics from Dbpedia and Freebase. The Ookaboo RDF dump is released under a CC-BY-SA license that is friendly to both academic and commericial use. With precision in excess of 0.98, Ookaboo enables entirely new applications for image search and classification. The 2012-01-23 beta release of Ookaboo contains detailed documentation and a SPARQL query cookbook that make it easy to download, install and build applications based on the dump. The 2011-01-23 beta release has been tested on Virtuoso OpenLink 6.1.4. 
Ontology2 founder Paul Houle says \"the RDF dump will be qualified against other leading triple stores before it gets out of beta. We want to work with vendors and users to produce a product of unprecedented quality that realizes the promise of Linked Data.\" The Ookaboo RDF dump is available at Please address inquiries to" "COUNT(*) with ORDER BY" "uHello, I'm having a little trouble forming a query that will tally how many pagelinks there are per page. The following query works to grab the links themselves: SELECT ?label ?uri ?link FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . ?link dbpedia2:wikilink ?uri . FILTER (langMatches(lang(?label), \"en\")) } And this one works to grab what seem to be accurate counts: SELECT ?label ?uri count(?uri) as ?count FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . ?link dbpedia2:wikilink ?uri . FILTER (langMatches(lang(?label), \"en\")) } GROUP BY ?label ?uri But when I try to order these results by count, all counts are the same (6 in this case): SELECT ?label ?uri count(?uri) as ?count FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . ?link dbpedia2:wikilink ?uri . FILTER (langMatches(lang(?label), \"en\")) } GROUP BY ?label ?uri ORDER BY ?count Any ideas? It's still a little boggling to me how aggregates are supposed to work; I'm probably just butchering the documentation [1] and interpreting it incorrectly. Matt [1] rdfsparqlaggregate.html uMatt Mullins wrote: Matt, It's a darn bug :-( Kingsley uKingsley, Darn indeed! Glad I uncovered it. uHello Matt, Aggregating by a grouped expression is formally senseless and should signal an error. The problem is that the error is not signaled; the error diagnostics should be improved. The compiler goes crazy and the result from one grouping is used for all groups. This works: SELECT ?label ?uri count(1) as ?count FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . ?link dbpedia2:wikilink ?uri . FILTER (langMatches(lang(?label), \"en\")) } GROUP BY ?label ?uri ORDER BY ?count Best Regards, Ivan. P.S. The real bug is that the correct query SELECT ?label ?uri count(?link) as ?count FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . ?link dbpedia2:wikilink ?uri . FILTER (langMatches(lang(?label), \"en\")) } GROUP BY ?label ?uri ORDER BY ?count does not work :| On Thu, 2009-08-06 at 15:40 -0400, Kingsley Idehen wrote: uIvan, That works wonderfully, thank you! I could've sworn I tried count(*) (which works, too) but I was probably changing too many variables at once trying to troubleshoot this. Your second query definitely was one of the ones I tried and makes the most sense to me. Thanks again, Matt uMatt Mullins wrote: uIssues again! This is the current version of my query. (A reminder: I'm trying to grab a list of all the articles that are members of Category:Pasta, ordered first by the number of inlinks each of those articles has and then by label): SELECT ?uri, ?label, count(*) as ?inlinks FROM WHERE { ?uri skos:subject . OPTIONAL {?uri rdfs:label ?label} . OPTIONAL {?inlink dbpedia2:wikilink ?uri} . FILTER (langMatches(lang(?label), \"en\")) } ORDER BY DESC(?inlinks) ASC(?label) This query gives me the correct results. I have an OPTIONAL around the ?label retrieval out of trial and error. This query (the same thing without the optional): SELECT ?uri, ?label, count(*) as ?inlinks FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . OPTIONAL {?inlink dbpedia2:wikilink ?uri} .
FILTER (langMatches(lang(?label), \"en\")) } ORDER BY DESC(?inlinks) ASC(?label) Yields some weird results. Scroll down to :Al_forno or :Al_dente. Why are their labels \"Campanelle\" and \"Fiori (pasta)\"? :Passatelli and :O.B._Macaroni also have mismatched labels. Their connection? They are the only articles in the results that don't have any inlinks (ideally their inlink count would be 0, but I don't know how to form a query to do that). If I take ?label out of the ORDER BY the mismatching is fixed, but I don't have my desired ordering: SELECT ?uri, ?label, count(*) as ?inlinks FROM WHERE { ?uri skos:subject . ?uri rdfs:label ?label . OPTIONAL {?inlink dbpedia2:wikilink ?uri} . FILTER (langMatches(lang(?label), \"en\")) } ORDER BY DESC(?inlinks) Is this another bug? Or am I botching the SPARQL, or both, again? I appreciate the help, Matt uHi Matt, When I run your queries as-is against the DBpedia SPARQL endpoint ( I get: 37000 Error SP030: SPARQL compiler, line 8: Undefined namespace prefix at 'dbpedia2' before '?uri' If I change dbpedia2 to dbpedia then they run; why are you using dbpedia2? Anyway, with this change I can see what appears to be an issue with your second query, but in my case the same label \"Pastina\" is returned for all results, which is incorrect. We shall look into this, although I would like to know why you seem to see different incorrect data than I do; could this \"dbpedia2\" prefix be related? Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 12 Aug 2009, at 06:31, Matt Mullins wrote: uHi Matt, OK, I see your problem: you must be using the snorql endpoint ( dbpedia.org/snorql/) which has PREFIX dbpedia2: predefined. Setting this for the Virtuoso SPARQL endpoint enabled me to see your issue as well, so there is more to look into. Best Regards Hugh Williams Professional Services OpenLink Software Web: Support: Forums: On 12 Aug 2009, at 13:41, Hugh Williams wrote:" "Unable to load all triples for wikipedia pagelinks" "uHello, I have installed a DBpedia endpoint and I am trying to load the DBpedia datasets I have downloaded (the 3.7 DBpedia datasets). While all other files are loaded correctly, when I try to load \"page_links_en.nt\" the loader loads only 4951244 triples instead of 145.9M triples. I tried redownloading the file and reloading it into the Virtuoso endpoint, and I still got exactly the same number of triples. I load the files using the isql commands: ld_dir('absolute/path/to/folder', '*.*', 'graphname'); rdf_loader_run(); I used the same commands for all other datasets and they worked fine. Is there any problem with the file I downloaded, or is it something else I am doing wrong? Thanks a lot for your help, Chryssa uHi Chryssa, What does the “load_list” table report as the status of loading that dataset?
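Independently of the load_list check, a quick sanity check for a partial load like the one above is to count the triple lines in the downloaded dump and compare that figure with what the store reports. A minimal sketch in Python; the file name and the bzip2 compression are assumptions, so adjust them to the file actually downloaded:

import bz2

# Count non-blank, non-comment lines in the N-Triples dump so the number can
# be compared with the triple count reported after bulk loading.
count = 0
with bz2.open("page_links_en.nt.bz2", "rt", encoding="utf-8", errors="replace") as dump:
    for line in dump:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
print(count, "triple lines in the dump")

If the dump-side count looks right but the store-side count stays low, the loader configuration (for example the ShortenLongURIs parameter discussed next) is the more likely culprit.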
Please also refer to the following tip on the Virtuoso “ShortenLongURIs” param, which is required for some of the DBpedia 3.7 datasets for them to load fully: Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // 10 Burlington Mall Road, Suite 265, Burlington MA 01803 Weblog uIf you mean the ll_state of the load_list table, it has the value 2. Also, the ll_error has the value NULL. I am aware of the use of the ShortenLongURIs parameter, and I had no problems with long URIs. Also, while loading there was no error message. On 14 March 2012, 11:35 a.m., Hugh Williams < > wrote: uOn 14/03/2012 10:17, Chryssa Zerva wrote: Did you get any errors from the table load_list? Try with: SELECT * FROM load_list and check if the field ll_error is not null. You can also check the virtuoso.log file to see where the errors occurred. Which version of Virtuoso did you install? You should have at least version 6.1.4 open source (it's the latest). Your problem probably depends on long URIs. In this case, follow the guide at: virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtTipsAndTricksGuideShortenLongURIs cheers, roberto uOn 14/03/2012 11:02, Chryssa Zerva wrote: What about the buffers? NumberOfBuffers MaxDirtyBuffers uI have fixed the NumberOfBuffers and the MaxDirtyBuffers according to this link (my settings are a bit lower than the suggested 4GB settings). Do the buffer configurations relate to the number of triples I am able to load? I thought they just related to the loading time. Should I change something? Thanks a lot! On 14 March 2012, 12:28 p.m., Roberto Mirizzi < > wrote: uAndreas (in CC) seems to be loading wikiPageLink triples right now. Maybe he can advise. Cheers, Pablo 2012/3/14 Chryssa Zerva < > uAndreas, I would be grateful for any feedback or advice, since I have not figured out the problem yet! Pablo, thanks as well! :) On 14 March 2012, 1:24 p.m., Pablo Mendes < > wrote: uOn 14/03/2012 15:34, Andreas Schultz wrote: I didn't figure out that problem before reading your emails. I checked this on my instance: wc -l page_links_en.nt gives me back 145877010 rows, while this query: SELECT COUNT(*) WHERE {?s dbpedia-owl:wikiPageWikiLink ?o} returns 118039661. Where are the remaining 28M triples? :-)" "Inter-language links (WikiData)" "uHi to all, I'm using the DBpedia extraction framework for the Esperanto DBpedia, but I noticed that I cannot access the inter-language links, which are stored in Wikidata. In the English DBpedia the sameAs links are correctly set: how can I extract them for Esperanto? Is there a tutorial for doing that? Thank you! Alessio uHi Alessio, If you only need the language link information, you could possibly make use of the Wikidata RDF dumps [1]. Currently, the most recent site link dump is [2]. It contains N3 of the following form for each linked page: . . \"sk\" . Information integration could be achieved through the Wikipedia page URL that you already have, or through the Wikidata IRI ( Cheers, Markus [1] [2] (Btw, \"site link\" is the general term, since Wikimedia runs sites other than Wikipedia that are linked as well.) On 14.10.2014 12:00, Alessio Palmero Aprosio wrote: uThank you! I solved it by downloading the cross-language links from here [1]. Best, Alessio [1] On 14/10/14 16:42, Markus Kroetzsch wrote: uHello Alessio, BTW, the dumps at Wikimedia [1] include a separate SQL dump that has all the interlingual links.
The only thing you would have to implement is a (pretty simple) SQL parser that extracts the links from that dump. You could also load the dump into MySQL and export the necessary data to a format of your choice. Cheers, Aleksander [1] uHi Aleksander, On 14.10.2014 16:57, wrote: A bit more is needed. If you want the actual URLs, you need to also parse the sitelinks table dump, which resolves things like \"enwiki\" to \"en.wikipedia.org\" (you can find this dump on the dumps site as well). Moreover, if you want to find the language code for each site, you need to implement a conversion from Wikipedia language codes to official language codes (as used in RDF literals etc.); there are a number of exceptions where the two don't agree. uThe way we do this since the 2014 release is with the WikidataLLExtractor. You only have to set wikidata as a language in the download config and then set only WikidataLLExtractor for the wikidata language in the extraction config. At the moment this generates all interwiki links between all languages defined in the mappings wiki. I suggest you go this way since it keeps the proper encoding (URI / IRI) and naming conventions for all languages. FYI, Ali Ismayilov (in cc) is working on the Wikidata / DBpedia integration. Best, Dimitris On Tue, Oct 14, 2014 at 5:57 PM, < > wrote: uHi Markus, thanks for this clarification. In fact I haven't paid too much attention to the language codes, since so far I was interested in the most popular languages. It's good to know that the Wikidata dumps are more accurate in this respect. Cheers, Aleksander" "Like DBpedia? You'll love :BaseKB" "uWe've cracked the code of the Freebase quad dump and produced what we believe is the first correct conversion of Freebase into industry-standard RDF. By installing :BaseKB into any market-leading triple store, you can query Freebase with the powerful SPARQL 1.1 language. uOn Thu, 12 Apr 2012 01:18:18 +0200, Paul A. Houle < > wrote: uOn 12/04/2012 01:18, Paul A. Houle wrote: I'm just downloading the KB. I cannot wait to see it in action! Meanwhile, what is the difference between the KB I'm downloading and the Freebase data dumps ( about a simple conversion between MQL and SPARQL ( Paul, a little typo in the documentation at \"is is inexpensive to add a large amount of RAM []\" uOn Thu, 12 Apr 2012 23:33:45 +0200, Paul A. Houle < > wrote: uOn 4/13/2012 6:35 AM, baran_H wrote: I think the pendulum will swing to and away from \"the cloud\" and I think there's a place for everything. Most advancements in hardware and software will help both local installations and public endpoints. The one thing that could help public endpoints would be operating as distributed main-memory databases, but for the economics of that to work out you need very high query volume. It comes down to the \"proof\" and \"trust\" parts of the Linked Data stack. Even if we don't have fully automated answers for these, the fact is that different Linked Data sources operate from different points of view, and to maintain a system point of view you need to decide what you \"trust\" and to what extent. If there's a particular piece of the Linked Data web that's well behaved and well understood, you can build a simple app that exploits it. In general the Linked Data web is a wild and woolly place and you need to do some data cleanup before you can write queries. So you need a system like Sindice, which builds a knowledge base (in their case 50*10^9 triples) from a crawl.
I see Linked Data as being more like a conversation between humans than a conversation between neurons. An agent working in this space needs to have some ability to ground terms, which means having either a 10^8+ triple 'generic database' or a beyond-state-of-the-art upper ontology of some kind. Of course, there's a lot of room for specialized techniques. Not long ago I'd figured out a really interesting calculation that could be expressed in SPARQL. Now, running this SPARQL query for all of the terms in our knowledge base would have taken 100 years, but I had just 2 weeks to deliver a product. With more hardware and a different triple store, maybe I could have done it in 10 years or 5 years (and I would have blown the schedule simply negotiating the software license with my boss and the vendor). Instead, I developed a specialized algorithm that did the calculation in 24 hours. To go back to Sindice, they developed a framework for building a full-text index out of RDF data while bypassing the triple store. It's fast and very scalable. I've played up the option of loading :BaseKB into a triple store because it really is straightforward, flexible, and a lot of fun. However, you can do really amazing things with this kind of RDF dump without the triple store, even on a 32-bit machine.
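As an illustration of working with such a dump without a triple store, here is a minimal sketch that streams an N-Triples file and tallies predicate frequencies. The file name is a placeholder and the gzip compression is an assumption; the parsing also assumes well-formed N-Triples with one triple per line:

import gzip
from collections import Counter

# Stream the dump line by line and count predicate usage, without loading
# anything into a triple store.
predicates = Counter()
with gzip.open("dump.nt.gz", "rt", encoding="utf-8", errors="replace") as dump:
    for line in dump:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 2)  # subject, predicate, remainder
        if len(parts) == 3:
            predicates[parts[1]] += 1

for predicate, n in predicates.most_common(10):
    print(n, predicate)

Because the script only ever holds one line plus the counter in memory, it runs comfortably on modest hardware, which is exactly the point being made above.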