{
"metadata": {
"name": "",
"signature": "sha256:7ec7a2ebdd08378856d0be2ae7df1f98d910a9349ad8870f6db8586bf464fd8f"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"The Big Bang Theory TVD Plugin"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Install"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following command will install **The Big Bang Theory TVD plugin** (and **TVD** if it is missing)\n",
"```bash\n",
"pip install TVDTheBigBangTheory\n",
"```"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Download all resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following command will download all resources for **The Big Bang Theory** into `/tmp/tvd_corpus` directory.\n",
"```bash\n",
"python -m tvd.create www /tmp/tvd_corpus/ TheBigBangTheory\n",
"```"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Available resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Initialize The Big Bang Theory TVD plugin"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"from tvd import TheBigBangTheory\n",
"tbbt = TheBigBangTheory('/tmp/tvd_corpus')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"IN CASE YOU USE 'speaker' RESOURCES, PLEASE CONSIDER CITING:\n",
"@inproceedings{Tapaswi2012\n",
" title = {{``Knock! Knock! Who is it?'' Probabilistic Person Identification in TV Series}},\n",
" author = {Makarand Tapaswi and Martin B\\\"{a}uml and Rainer Stiefelhagen},\n",
" booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n",
" year = {2012},\n",
" month = {June},\n",
"}\n",
"\n",
"IN CASE YOU USE 'outline' RESOURCES, PLEASE CONSIDER CITING:\n",
"@misc{the-big-bang-theory.com,\n",
" title = {{The Big Bang Theory Wiki}},\n",
" howpublished = \\url{http://wiki.the-big-bang-theory.com/}\n",
"}\n",
"\n",
"IN CASE YOU USE 'manual_transcript' RESOURCES, PLEASE CONSIDER CITING:\n",
"@misc{bigbangtrans,\n",
" title = {{big bang theory transcripts}},\n",
" howpublished = \\url{http://bigbangtrans.wordpress.com/}\n",
"}\n",
"\n"
]
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get first episode"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"episode = tbbt.episodes[0]\n",
"episode"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 2,
"text": [
"Episode(series='TheBigBangTheory', season=1, episode=1)"
]
}
],
"prompt_number": 2
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Outlines"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Source: http://wiki.the-big-bang-theory.com/ \n",
"Provides: \n",
" \n",
" - segmentation into scenes\n",
" - scene location\n",
" - scene summary\n",
"\n",
"Does not provide timestamps.\n",
"\n",
"```\n",
"I. Hallway outside High-IQ Sperm Bank\n",
"Sheldon tells his \"good idea for a T-shirt\".\n",
"II. High-IQ Sperm Bank\n",
"Sheldon and Leonard consider donating sperm, but back out.\n",
"III. Stairs\n",
"Sheldon and Leonard head back to the apartment, discussing the height of stair steps.\n",
"...\n",
"```"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from tvd import AnnotationGraph\n",
"outline = AnnotationGraph.load(tbbt.path_to_resource(episode, 'outline'))\n",
"outline"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"svg": [
"\n"
],
"text": [
""
]
}
],
"prompt_number": 3
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Manual transcript"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Source: http://bigbangtrans.wordpress.com/ \n",
"Provides:\n",
"\n",
" - scene location\n",
" - speaker label\n",
" - speech content\n",
"\n",
"Does not provide timestamps.\n",
"\n",
"```\n",
"Scene: A corridor at a sperm bank.\n",
"Sheldon: So if a photon is directed through a plane with two slits in it and either slit is observed it will not go through both slits. If it\u2019s unobserved it will, however, if it\u2019s observed after it\u2019s left the plane but before it hits its target, it will not have gone through both slits.\n",
"Leonard: Agreed, what\u2019s your point?\n",
"Sheldon: There\u2019s no point, I just think it\u2019s a good idea for a tee-shirt.\n",
"Leonard: Excuse me?\n",
"Receptionist: Hang on.\n",
"...\n",
"```"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"manual_transcript = AnnotationGraph.load(tbbt.path_to_resource(episode, 'manual_transcript'))\n",
"manual_transcript"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"svg": [
"\n"
],
"text": [
""
]
}
],
"prompt_number": 4
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Speaker labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Source: [Makarand Tapaswi](https://cvhci.anthropomatik.kit.edu/~mtapaswi/projects/personid.html) \n",
"Provides:\n",
"\n",
" - speaker label with manual timestamps (five main characters + *other*)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"speaker = AnnotationGraph.load(tbbt.path_to_resource(episode, 'speaker'))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# show only a subgraph (between t=30s and t=60s)\n",
"from tvd import TAnchored\n",
"speaker.crop(TAnchored(30), TAnchored(60))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"svg": [
"\n"
],
"text": [
""
]
}
],
"prompt_number": 6
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Force-aligned manual transcripts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Source: [LIMSI](http://www.limsi.fr) \n",
"Provides:\n",
"\n",
" - word-level timestamps (start & end time)\n"
]
}
],
"metadata": {}
}
]
}