# Natural Language Toolkit: Chat-80 KB Reader 

# See http://www.w3.org/TR/swbp-skos-core-guide/ 

# 

# Copyright (C) 2001-2012 NLTK Project 

# Author: Ewan Klein <ewan@inf.ed.ac.uk>

# URL: <http://nltk.sourceforge.net> 

# For license information, see LICENSE.TXT 

 

""" 

Overview 

======== 

 

Chat-80 was a natural language system which allowed the user to 

interrogate a Prolog knowledge base in the domain of world 

geography. It was developed in the early '80s by Warren and Pereira; see 

``http://www.aclweb.org/anthology/J82-3002.pdf`` for a description and 

``http://www.cis.upenn.edu/~pereira/oldies.html`` for the source 

files. 

 

This module contains functions to extract data from the Chat-80 

relation files ('the world database'), and convert them into a format

that can be incorporated in the FOL models of 

``nltk.sem.evaluate``. The code assumes that the Prolog 

input files are available in the NLTK corpora directory. 

 

The Chat-80 World Database consists of the following files:: 

 

    world0.pl 

    rivers.pl 

    cities.pl 

    countries.pl 

    contain.pl 

    borders.pl 

 

This module uses a slightly modified version of ``world0.pl``, in which 

a set of Prolog rules has been omitted. The modified file is named

``world1.pl``. Currently, the file ``rivers.pl`` is not read in, since 

it uses a list rather than a string in the second field. 

 

Reading Chat-80 Files 

===================== 

 

Chat-80 relations are like tables in a relational database. The 

relation acts as the name of the table; the first argument acts as the 

'primary key'; and subsequent arguments are further fields in the 

table. In general, the name of the table provides a label for a unary 

predicate whose extension is all the primary keys. For example, 

relations in ``cities.pl`` are of the following form:: 

 

   'city(athens,greece,1368).' 

 

Here, ``'athens'`` is the key, and will be mapped to a member of the 

unary predicate *city*. 

 

The fields in the table are mapped to binary predicates. The first 

argument of the predicate is the primary key, while the second 

argument is the data in the relevant field. Thus, in the above 

example, the third field is mapped to the binary predicate 

*population_of*, whose extension is a set of pairs such as 

``'(athens, 1368)'``. 
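
As a rough illustration of this mapping (a sketch only, not the code this
module runs; ``parse_clause`` is a hypothetical helper introduced here), a
single clause can be split into a record and then turned into one unary and
one binary extension:

```python
def parse_clause(line, rel_name):
    # Hypothetical helper: strip the 'rel_name(' prefix and the ').' suffix,
    # then split the remaining arguments on commas.
    line = line.strip()
    prefix = rel_name + "("
    suffix = ")."
    if not (line.startswith(prefix) and line.endswith(suffix)):
        return None
    return line[len(prefix):-len(suffix)].split(",")


record = parse_clause("city(athens,greece,1368).", "city")
# The primary key joins the extension of the unary predicate 'city'.
city_ext = {record[0]}
# The third field is paired with the key in the binary predicate 'population_of'.
population_of_ext = {(record[0], record[2])}
```

Note that every field stays a string, so the population is ``'1368'`` rather
than an integer.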

 

An exception to this general framework is required by the relations in 

the files ``borders.pl`` and ``contain.pl``. These contain facts of the

following form:: 

 

    'borders(albania,greece).' 

 

    'contains0(africa,central_africa).' 

 

We do not want to form a unary concept out of the element in

the first field of these records, and we want the label of the binary 

relation just to be ``'border'``/``'contain'`` respectively. 

 

In order to drive the extraction process, we use 'relation metadata bundles' 

which are Python dictionaries such as the following:: 

 

  city = {'rel_name': 'city',

          'closures': [], 

          'schema': ['city', 'country', 'population'], 

          'filename': 'cities.pl'} 

 

According to this, the file ``city['filename']`` contains a list of 

relational tuples (or more accurately, the corresponding strings in 

Prolog form) whose predicate symbol is ``city['rel_name']`` and whose

relational schema is ``city['schema']``. The notion of a ``closure`` is 

discussed in the next section. 
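
To make the role of a bundle concrete, the following sketch lists the concept
labels and arities that a bundle gives rise to. It follows the labelling rules
described above; ``concept_labels`` is a hypothetical name used only for this
illustration and is not part of the module:

```python
NOT_UNARY = ("borders.pl", "contain.pl")


def concept_labels(bundle):
    # Return (label, arity) pairs for the concepts a bundle would produce.
    schema = bundle["schema"]
    labels = []
    # The primary key yields a unary concept, except for the files whose
    # relations are treated as plain binary relations.
    if bundle["filename"] not in NOT_UNARY:
        labels.append((schema[0], 1))
    # Every other field yields a binary concept; 'border' and 'contain'
    # keep their bare label, all other fields get an '_of' suffix.
    for field in schema[1:]:
        label = field if field in ("border", "contain") else field + "_of"
        labels.append((label, 2))
    return labels


city = {"rel_name": "city", "closures": [],
        "schema": ["city", "country", "population"],
        "filename": "cities.pl"}
borders = {"rel_name": "borders", "closures": ["symmetric"],
           "schema": ["region", "border"],
           "filename": "borders.pl"}

city_labels = concept_labels(city)
border_labels = concept_labels(borders)
```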

 

Concepts 

======== 

In order to encapsulate the results of the extraction, a class of 

``Concept`` objects is introduced.  A ``Concept`` object has a number of 

attributes, in particular a ``prefLabel`` and ``extension``, which make 

it easier to inspect the output of the extraction. In addition, the 

``extension`` can be further processed: in the case of the ``'border'`` 

relation, we check that the relation is symmetric, and in the case 

of the ``'contain'`` relation, we carry out the transitive 

closure. The closure properties associated with a concept are

indicated in the relation metadata, as indicated earlier. 
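
The two closure operations can be sketched on plain sets of pairs as follows.
This is a simplified fixed-point version written for illustration, not the
adjacency-list code used by ``Concept.close``; the function names and the
sample pairs are assumptions of this sketch:

```python
def symmetric_closure(pairs):
    # Add the reverse (y, x) of every pair (x, y).
    return set(pairs) | {(y, x) for (x, y) in pairs}


def transitive_closure(pairs):
    # Repeatedly add (x, w) whenever (x, y) and (y, w) are both present,
    # stopping once a pass adds nothing new.
    closed = set(pairs)
    while True:
        new = {(x, w) for (x, y) in closed for (z, w) in closed if y == z}
        if new <= closed:
            return closed
        closed |= new


sym = symmetric_closure({("albania", "greece")})
trans = transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})
```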

 

The ``extension`` of a ``Concept`` object is then incorporated into a 

``Valuation`` object. 

 

Persistence 

=========== 

The functions ``val_dump`` and ``val_load`` are provided to allow a 

valuation to be stored in a persistent database and re-loaded, rather 

than having to be re-computed each time. 
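
The underlying idea is an ordinary ``shelve`` round trip. The sketch below
stores a plain dictionary rather than a real ``Valuation``; the helper names,
the temporary path and the sample data are all assumptions made for the
example:

```python
import os
import shelve
import tempfile


def dump_mapping(mapping, path):
    # 'n' always creates a new, empty database at the given path.
    db = shelve.open(path, "n")
    try:
        db.update(mapping)
    finally:
        db.close()


def load_mapping(path):
    # Copy the stored items into an ordinary dict before closing the shelf.
    db = shelve.open(path, "r")
    try:
        return dict(db)
    finally:
        db.close()


path = os.path.join(tempfile.mkdtemp(), "chat80_demo")
dump_mapping({"city": {"athens"}, "athens": "athens"}, path)
restored = load_mapping(path)
```

Which files actually appear on disk, and with which suffix, depends on the
``dbm`` backend that ``shelve`` selects on the current platform.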

 

Individuals and Lexical Items 

============================= 

As well as deriving relations from the Chat-80 data, we also create a 

set of individual constants, one for each entity in the domain. The 

individual constants are string-identical to the entities. For 

example, given a data item such as ``'zloty'``, we add to the valuation 

a pair ``('zloty', 'zloty')``. In order to parse English sentences that 

refer to these entities, we also create a lexical item such as the 

following for each individual constant:: 

 

   PropN[num=sg, sem=<\P.(P zloty)>] -> 'Zloty' 

 

The set of rules is written to the file ``chat_pnames.cfg`` in the 

current directory. 
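
The construction of one such rule can be sketched as below. The function name
``proper_name_rule`` is hypothetical; the capitalization scheme mirrors the
one used by ``make_lex`` in this module:

```python
def proper_name_rule(symbol):
    # Capitalize each underscore-separated part of the constant,
    # so that 'south_africa' becomes 'South_Africa'.
    pname = "_".join(part.capitalize() for part in symbol.split("_"))
    return "PropN[num=sg, sem=<\\P.(P %s)>] -> '%s'" % (symbol, pname)


rule = proper_name_rule("zloty")
```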

 

""" 

from __future__ import print_function 

 

import re 

import shelve 

import os 

import sys 

 

from nltk.data import find 

 

 

########################################################################### 

# Chat-80 relation metadata bundles needed to build the valuation 

########################################################################### 

 

borders = {'rel_name': 'borders', 

           'closures': ['symmetric'], 

           'schema': ['region', 'border'], 

           'filename': 'borders.pl'} 

 

contains = {'rel_name': 'contains0', 

            'closures': ['transitive'], 

            'schema': ['region', 'contain'], 

            'filename': 'contain.pl'} 

 

city = {'rel_name': 'city', 

        'closures': [], 

        'schema': ['city', 'country', 'population'], 

        'filename': 'cities.pl'} 

 

country = {'rel_name': 'country', 

           'closures': [], 

           'schema': ['country', 'region', 'latitude', 'longitude', 

                      'area', 'population', 'capital', 'currency'], 

           'filename': 'countries.pl'} 

 

circle_of_lat = {'rel_name': 'circle_of_latitude', 

                 'closures': [], 

                 'schema': ['circle_of_latitude', 'degrees'], 

                 'filename': 'world1.pl'} 

 

circle_of_long = {'rel_name': 'circle_of_longitude', 

                 'closures': [], 

                 'schema': ['circle_of_longitude', 'degrees'], 

                 'filename': 'world1.pl'} 

 

continent = {'rel_name': 'continent', 

             'closures': [], 

             'schema': ['continent'], 

             'filename': 'world1.pl'} 

 

region = {'rel_name': 'in_continent', 

          'closures': [], 

          'schema': ['region', 'continent'], 

          'filename': 'world1.pl'} 

 

ocean = {'rel_name': 'ocean', 

         'closures': [], 

         'schema': ['ocean'], 

         'filename': 'world1.pl'} 

 

sea = {'rel_name': 'sea', 

       'closures': [], 

       'schema': ['sea'], 

       'filename': 'world1.pl'} 

 

 

 

items = ['borders', 'contains', 'city', 'country', 'circle_of_lat', 

         'circle_of_long', 'continent', 'region', 'ocean', 'sea'] 

items = tuple(sorted(items)) 

 

item_metadata = { 

    'borders': borders, 

    'contains': contains, 

    'city': city, 

    'country': country, 

    'circle_of_lat': circle_of_lat, 

    'circle_of_long': circle_of_long, 

    'continent': continent, 

    'region': region, 

    'ocean': ocean, 

    'sea': sea 

    } 

 

rels = item_metadata.values() 

 

not_unary = ['borders.pl', 'contain.pl'] 

 

########################################################################### 

 

class Concept(object): 

    """ 

    A Concept class, loosely based on SKOS 

    (http://www.w3.org/TR/swbp-skos-core-guide/). 

    """ 

    def __init__(self, prefLabel, arity, altLabels=None, closures=None, extension=None):
        """
        :param prefLabel: the preferred label for the concept
        :type prefLabel: str
        :param arity: the arity of the concept
        :type arity: int
        :param altLabels: other (related) labels
        :type altLabels: list
        :param closures: closure properties of the extension
            (list items can be ``symmetric``, ``reflexive``, ``transitive``)
        :type closures: list
        :param extension: the extensional value of the concept
        :type extension: set
        """
        self.prefLabel = prefLabel
        self.arity = arity
        # create fresh containers per instance: mutable default arguments
        # would be shared by every Concept built without these arguments
        self.altLabels = [] if altLabels is None else altLabels
        self.closures = [] if closures is None else closures
        # keep _extension internally as a set
        self._extension = set() if extension is None else extension
        # public access is via a list (for slicing)
        self.extension = list(self._extension)

 

    def __str__(self): 

        #_extension = '' 

        #for element in sorted(self.extension): 

            #if isinstance(element, tuple): 

                #element = '(%s, %s)' % (element) 

            #_extension += element + ', ' 

        #_extension = _extension[:-1] 

 

        return "Label = '%s'\nArity = %s\nExtension = %s" % (
            self.prefLabel, self.arity, self.extension)

 

    def __repr__(self): 

        return "Concept('%s')" % self.prefLabel 

 

    def augment(self, data): 

        """ 

        Add more data to the ``Concept``'s extension set. 

 

        :param data: a new semantic value 

        :type data: string or pair of strings 

        :rtype: set 

 

        """ 

        self._extension.add(data) 

        self.extension = list(self._extension) 

        return self._extension 

 

 

    def _make_graph(self, s): 

        """ 

        Convert a set of pairs into an adjacency-list encoding of a graph.

        """ 

        g = {} 

        for (x, y) in s: 

            if x in g: 

                g[x].append(y) 

            else: 

                g[x] = [y] 

        return g 

 

    def _transclose(self, g): 

        """ 

        Compute the transitive closure of a graph represented as an adjacency list.

        """ 

        for x in g: 

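            # g[x] grows while we iterate over it, so nodes appended below
            # are visited in turn and every node reachable from x is added
            for adjacent in g[x]: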

                # check that adjacent is a key 

                if adjacent in g: 

                    for y in g[adjacent]: 

                        if y not in g[x]: 

                            g[x].append(y) 

        return g 

 

    def _make_pairs(self, g): 

        """ 

        Convert an adjacency list back into a set of pairs.

        """ 

        pairs = [] 

        for node in g: 

            for adjacent in g[node]: 

                pairs.append((node, adjacent)) 

        return set(pairs) 

 

 

    def close(self): 

        """ 

        Close a binary relation in the ``Concept``'s extension set. 

 

        :return: a new extension for the ``Concept`` in which the 

                 relation is closed under a given property 

        """ 

        from nltk.sem import is_rel 

        assert is_rel(self._extension) 

        if 'symmetric' in self.closures: 

            pairs = [] 

            for (x, y) in self._extension: 

                pairs.append((y, x)) 

            sym = set(pairs) 

            self._extension = self._extension.union(sym) 

        if 'transitive' in self.closures: 

            # 'graph' rather than 'all', to avoid shadowing the builtin
            graph = self._make_graph(self._extension)
            closed = self._transclose(graph)
            trans = self._make_pairs(closed)
            self._extension = self._extension.union(trans)

        self.extension = list(self._extension) 

 

 

def clause2concepts(filename, rel_name, schema, closures=[]): 

    """ 

    Convert a file of Prolog clauses into a list of ``Concept`` objects. 

 

    :param filename: filename containing the relations 

    :type filename: str 

    :param rel_name: name of the relation 

    :type rel_name: str 

    :param schema: the schema used in a set of relational tuples 

    :type schema: list 

    :param closures: closure properties for the extension of the concept 

    :type closures: list 

    :return: a list of ``Concept`` objects 

    :rtype: list 

    """ 

    concepts = [] 

    # position of the subject of a binary relation 

    subj = 0 

    # label of the 'primary key' 

    pkey = schema[0] 

    # fields other than the primary key 

    fields = schema[1:] 

 

    # convert a file into a list of lists 

    records = _str2records(filename, rel_name) 

 

    # add a unary concept corresponding to the set of entities 

    # in the primary key position 

    # relations in 'not_unary' are more like ordinary binary relations 

    if filename not in not_unary:

        concepts.append(unary_concept(pkey, subj, records)) 

 

    # add a binary concept for each non-key field 

    for field in fields: 

        obj = schema.index(field) 

        concepts.append(binary_concept(field, closures, subj, obj, records)) 

 

    return concepts 

 

def cities2table(filename, rel_name, dbname, verbose=False, setup=False): 

    """ 

    Convert a file of Prolog clauses into a database table. 

 

    This is not generic, since it doesn't allow arbitrary 

    schemas to be set as a parameter. 

 

    Intended usage:: 

 

        cities2table('cities.pl', 'city', 'city.db', verbose=True, setup=True) 

 

    :param filename: filename containing the relations 

    :type filename: str 

    :param rel_name: name of the relation 

    :type rel_name: str 

    :param dbname: filename of persistent store 

    :type dbname: str

    """ 

    import sqlite3 

    records = _str2records(filename, rel_name) 

    connection =  sqlite3.connect(dbname) 

    cur = connection.cursor() 

    if setup: 

        cur.execute('''CREATE TABLE city_table 

        (City text, Country text, Population int)''') 

 

    table_name = "city_table" 

    for t in records: 

        cur.execute('insert into %s values (?,?,?)' % table_name, t) 

        if verbose: 

            print("inserting values into %s: " % table_name, t) 

    connection.commit() 

    if verbose: 

        print("Committing update to %s" % dbname) 

    cur.close()
    connection.close()

 

def sql_query(dbname, query): 

    """ 

    Execute an SQL query over a database. 

    :param dbname: filename of persistent store 

    :type dbname: str

    :param query: SQL query 

    :type query: str

    """ 

    try: 

        import sqlite3 

        path = find(dbname) 

        connection =  sqlite3.connect(path) 

        # return ASCII strings if possible 

        connection.text_factory = sqlite3.OptimizedUnicode 

        cur = connection.cursor() 

        return cur.execute(query) 

    except ValueError: 

        import warnings 

        warnings.warn("Make sure the database file %s is installed and uncompressed." % dbname) 

        raise 

 

def _str2records(filename, rel): 

    """ 

    Read a file into memory and convert each relation clause into a list. 

    """ 

    recs = [] 

    path = find("corpora/chat80/%s" % filename) 

    for line in path.open(): 

        if line.startswith(rel): 

            line = re.sub(rel+r'\(', '', line) 

            line = re.sub(r'\)\.$', '', line) 

            line = line[:-1] 

            record = line.split(',') 

            recs.append(record) 

    return recs 

 

def unary_concept(label, subj, records): 

    """ 

    Make a unary concept out of the primary key in a record. 

 

    A record is a list of entities in some relation, such as 

    ``['france', 'paris']``, where ``'france'`` is acting as the primary 

    key. 

 

    :param label: the preferred label for the concept 

    :type label: string 

    :param subj: position in the record of the subject of the predicate 

    :type subj: int 

    :param records: a list of records 

    :type records: list of lists 

    :return: ``Concept`` of arity 1 

    :rtype: Concept 

    """ 

    c = Concept(label, arity=1, extension=set()) 

    for record in records: 

        c.augment(record[subj]) 

    return c 

 

def binary_concept(label, closures, subj, obj, records): 

    """ 

    Make a binary concept out of the primary key and another field in a record. 

 

    A record is a list of entities in some relation, such as 

    ``['france', 'paris']``, where ``'france'`` is acting as the primary 

    key, and ``'paris'`` stands in the ``'capital_of'`` relation to 

    ``'france'``. 

 

    More generally, given a record such as ``['a', 'b', 'c']``, where 

    label is bound to ``'B'``, and ``obj`` bound to 1, the derived 

    binary concept will have label ``'B_of'``, and its extension will 

    be a set of pairs such as ``('a', 'b')``. 

 

 

    :param label: the base part of the preferred label for the concept 

    :type label: str 

    :param closures: closure properties for the extension of the concept 

    :type closures: list 

    :param subj: position in the record of the subject of the predicate 

    :type subj: int 

    :param obj: position in the record of the object of the predicate 

    :type obj: int 

    :param records: a list of records 

    :type records: list of lists 

    :return: ``Concept`` of arity 2 

    :rtype: Concept 

    """ 

    if label not in ('border', 'contain'):

        label = label + '_of' 

    c = Concept(label, arity=2, closures=closures, extension=set()) 

    for record in records: 

        c.augment((record[subj], record[obj])) 

    # close the concept's extension according to the properties in closures 

    c.close() 

    return c 

 

 

def process_bundle(rels): 

    """ 

    Given a list of relation metadata bundles, make a corresponding 

    dictionary of concepts, indexed by the relation name. 

 

    :param rels: bundle of metadata needed for constructing a concept 

    :type rels: list of dict 

    :return: a dictionary of concepts, indexed by the relation name. 

    :rtype: dict 

    """ 

    concepts = {} 

    for rel in rels: 

        rel_name = rel['rel_name'] 

        closures = rel['closures'] 

        schema = rel['schema'] 

        filename = rel['filename'] 

 

        concept_list = clause2concepts(filename, rel_name, schema, closures) 

        for c in concept_list: 

            label = c.prefLabel 

            if label in concepts:

                for data in c.extension: 

                    concepts[label].augment(data) 

                concepts[label].close() 

            else: 

                concepts[label] = c 

    return concepts 

 

 

def make_valuation(concepts, read=False, lexicon=False): 

    """ 

    Convert a list of ``Concept`` objects into a list of (label, extension) pairs; 

    optionally create a ``Valuation`` object. 

 

    :param concepts: concepts 

    :type concepts: list(Concept) 

    :param read: if ``True``, ``(symbol, set)`` pairs are read into a ``Valuation`` 

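    :type read: bool
    :param lexicon: if ``True``, also write lexical rules for the individuals
        to ``chat_pnames.cfg`` (this implies ``read=True``)
    :type lexicon: bool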

    :rtype: list or Valuation 

    """ 

    vals = [] 

 

    for c in concepts: 

        vals.append((c.prefLabel, c.extension)) 

    if lexicon: read = True 

    if read: 

        from nltk.sem import Valuation 

        val = Valuation({}) 

        val.update(vals) 

        # add labels for individuals 

        val = label_indivs(val, lexicon=lexicon) 

        return val 

    else: return vals 

 

 

def val_dump(rels, db): 

    """ 

    Make a ``Valuation`` from a list of relation metadata bundles and dump to 

    persistent database. 

 

    :param rels: bundle of metadata needed for constructing a concept 

    :type rels: list of dict 

    :param db: name of file to which data is written. 

               The suffix '.db' will be automatically appended. 

    :type db: string 

    """ 

    concepts = process_bundle(rels).values() 

    valuation = make_valuation(concepts, read=True) 

    db_out = shelve.open(db, 'n') 

 

    db_out.update(valuation) 

 

    db_out.close() 

 

 

def val_load(db): 

    """ 

    Load a ``Valuation`` from a persistent database. 

 

    :param db: name of file from which data is read. 

               The suffix '.db' should be omitted from the name. 

    :type db: string 

    """ 

    dbname = db+".db" 

 

    if not os.access(dbname, os.R_OK): 

        sys.exit("Cannot read file: %s" % dbname) 

    else: 

        db_in = shelve.open(db) 

        from nltk.sem import Valuation 

        val = Valuation(db_in) 

#        val.read(db_in.items()) 

        return val 

 

 

#def alpha(str): 

    #""" 

    #Utility to filter out non-alphabetic constants. 

 

    #:param str: candidate constant 

    #:type str: string 

    #:rtype: bool 

    #""" 

    #try: 

        #int(str) 

        #return False 

    #except ValueError: 

        ## some unknown values in records are labeled '?' 

        #if not str == '?': 

            #return True 

 

 

def label_indivs(valuation, lexicon=False): 

    """ 

    Assign individual constants to the individuals in the domain of a ``Valuation``. 

 

    Given a valuation with an entry of the form ``{'rel': {'a': True}}``, 

    add a new entry ``{'a': 'a'}``. 

 

    :type valuation: Valuation 

    :rtype: Valuation 

    """ 

    # collect all the individuals into a domain 

    domain = valuation.domain 

    # pair each individual with itself: the same string serves as its label

    pairs = [(e, e) for e in domain] 

    if lexicon: 

        lex = make_lex(domain) 

        # use a context manager so the file is flushed and closed
        with open("chat_pnames.cfg", mode='w') as outfile:
            outfile.writelines(lex)

    # read the pairs into the valuation 

    valuation.update(pairs) 

    return valuation 

 

def make_lex(symbols): 

    """ 

    Create lexical CFG rules for each individual symbol. 

 

    Given a valuation with an entry of the form ``{'zloty': 'zloty'}``, 

    create a lexical rule for the proper name 'Zloty'. 

 

    :param symbols: a list of individual constants in the semantic representation 

    :type symbols: sequence 

    :rtype: list 

    """ 

    lex = [] 

    header = """ 

################################################################## 

# Lexical rules automatically generated by running 'chat80.py -x'. 

################################################################## 

 

""" 

    lex.append(header) 

    # escape the backslash explicitly; '\P' is not a valid string escape
    template = "PropN[num=sg, sem=<\\P.(P %s)>] -> '%s'\n"

 

    for s in symbols: 

        parts = s.split('_') 

        caps = [p.capitalize() for p in parts] 

        pname = '_'.join(caps)

        rule = template % (s, pname) 

        lex.append(rule) 

    return lex 

 

 

########################################################################### 

# Interface function to emulate other corpus readers 

########################################################################### 

 

def concepts(items=items):

    """ 

    Build a list of concepts corresponding to the relation names in ``items``. 

 

    :param items: names of the Chat-80 relations to extract 

    :type items: list of strings 

    :return: the ``Concept`` objects which are extracted from the relations 

    :rtype: list 

    """ 

    if isinstance(items, str): items = (items,) 

 

    rels = [item_metadata[r] for r in items] 

 

    concept_map = process_bundle(rels) 

    return concept_map.values() 

 

 

 

 

########################################################################### 

 

 

def main(): 

    import sys 

    from optparse import OptionParser 

    description = """
Extract data from the Chat-80 Prolog files and convert them into a
Valuation object for use in the NLTK semantics package.
    """

 

    opts = OptionParser(description=description) 

    opts.set_defaults(verbose=True, lex=False, vocab=False) 

    opts.add_option("-s", "--store", dest="outdb", 

                    help="store a valuation in DB", metavar="DB") 

    opts.add_option("-l", "--load", dest="indb", 

                    help="load a stored valuation from DB", metavar="DB") 

    opts.add_option("-c", "--concepts", action="store_true", 

                    help="print concepts instead of a valuation") 

    opts.add_option("-r", "--relation", dest="label", 

                    help="print concept with label REL (check possible labels with '-v' option)", metavar="REL") 

    opts.add_option("-q", "--quiet", action="store_false", dest="verbose", 

                    help="don't print out progress info") 

    opts.add_option("-x", "--lex", action="store_true", dest="lex", 

                    help="write a file of lexical entries for country names, then exit") 

    opts.add_option("-v", "--vocab", action="store_true", dest="vocab", 

                    help="print out the vocabulary of concept labels and their arity, then exit")

 

    (options, args) = opts.parse_args() 

    if options.outdb and options.indb: 

        opts.error("Options --store and --load are mutually exclusive") 

 

 

    if options.outdb: 

        # write the valuation to a persistent database 

        if options.verbose: 

            outdb = options.outdb+".db" 

            print("Dumping a valuation to %s" % outdb) 

        val_dump(rels, options.outdb) 

        sys.exit(0) 

    else: 

        # try to read in a valuation from a database 

        if options.indb is not None: 

            dbname = options.indb+".db" 

            if not os.access(dbname, os.R_OK): 

                sys.exit("Cannot read file: %s" % dbname) 

            else: 

                valuation = val_load(options.indb) 

        # we need to create the valuation from scratch 

        else: 

            # build some concepts 

            concept_map = process_bundle(rels) 

            concepts = concept_map.values() 

            # just print out the vocabulary 

            if options.vocab: 

                items = sorted([(c.arity, c.prefLabel) for c in concepts]) 

                for (arity, label) in items: 

                    print(label, arity) 

                sys.exit(0) 

            # show all the concepts 

            if options.concepts: 

                for c in concepts: 

                    print(c) 

                    print() 

            if options.label: 

                print(concept_map[options.label]) 

                sys.exit(0) 

            else: 

                # turn the concepts into a Valuation 

                if options.lex: 

                    if options.verbose: 

                        print("Writing out lexical rules") 

                    make_valuation(concepts, lexicon=True) 

                else: 

                    valuation = make_valuation(concepts, read=True) 

                    print(valuation) 

 

 

def sql_demo(): 

    """ 

    Print out every row from the 'city.db' database. 

    """ 

    print() 

    print("Using SQL to extract rows from 'city.db' RDB.") 

    for row in sql_query('corpora/city_database/city.db', "SELECT * FROM city_table"): 

        print(row) 

 

 

if __name__ == '__main__': 

    main() 

    sql_demo()