{ "cells": [ { "metadata": { "toc": true }, "cell_type": "markdown", "source": "

Table of Contents

\n
" }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "## Import data from excel files" }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "### Using XLConnect package" }, { "metadata": { "hide_input": false, "run_control": { "frozen": false, "marked": false, "read_only": false }, "slideshow": { "slide_type": "fragment" }, "trusted": false }, "cell_type": "code", "source": "#install.packages(\"XLConnect\", repos = \"http://cran.us.r-project.org\")\nlibrary(XLConnect) #lib.loc=\"C:/Users/Suman/Documents/R/win-library/3.4\")\n\n \n# Download an Excel file from the internet\n# url <- \"http://miraisolutions.files.wordpress.com/\"\n# path <- \"2013/01/\"\n# file.name <- \"example.xlsx\"\n# download.file(paste(url, path, file.name, sep = \"\"), file.name)\n \n# Load the workbook\nwb <- XLConnect::loadWorkbook(\"data/latitude.xlsx\")\n \n# Read data from a target section of the worksheet and print\nmy.data <- readWorksheet(wb, sheet = \"1700\", startRow = 10, startCol = 1, \n endRow = 42, endCol = 2)\nprint(my.data)", "execution_count": 19, "outputs": [ { "name": "stdout", "output_type": "stream", "text": " Antigua.and.Barbuda X17.072\n1 Argentina -36.67600\n2 Armenia 40.25400\n3 Aruba 12.51300\n4 Australia -32.21900\n5 Austria 48.23100\n6 Azerbaijan 40.35200\n7 Bahamas 24.70000\n8 Bahrain 26.02400\n9 Bangladesh 23.88000\n10 Barbados 13.17900\n11 Belarus 53.54709\n12 Belgium 50.83700\n13 Belize 17.84300\n14 Benin 6.36400\n15 Bermuda 32.21700\n16 Bhutan 27.47900\n17 Bolivia -15.19000\n18 Bosnia and Herzegovina 44.17501\n19 Botswana -21.53600\n20 Brazil -19.55700\n21 British Virgin Islands 18.50000\n22 Brunei 4.50100\n23 Bulgaria 42.07300\n24 Burkina Faso 12.04900\n25 Burundi -3.36500\n26 Cambodia 12.02600\n27 Cameroon 10.73000\n28 Canada 43.72700\n29 Cape Verde 
15.09100\n30 Cayman Islands 19.31900\n31 Central African Rep. 4.33100\n32 Chad 10.37700\n" } ] }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "### Using xlsx Package" }, { "metadata": { "hide_input": false, "run_control": { "frozen": false, "marked": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "fragment" }, "trusted": false }, "cell_type": "code", "source": "#install.packages(\"xlsx\", repos = \"http://cran.us.r-project.org\")\n# Use xlsx package\nlibrary(\"xlsx\")\nmy_data <- read.xlsx(\"data/cities.xlsx\", sheetName = \"year_2000\")\nmy_data\nmy_data <- read.xlsx2(\"data/cities.xlsx\", sheetName = \"year_1990\") # Prefer the read.xlsx2() over read.xlsx(), it’s significantly faster for large dataset.\nmy_data", "execution_count": 20, "outputs": [ { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\n
Capital Population
New York 17800000
Berlin 3382169
Madrid 2938723
Stockholm 1942362
NA NA
NA NA
NA NA
NA NA
NA NA
\n", "text/latex": "\\begin{tabular}{r|ll}\n Capital & Population\\\\\n\\hline\n\t New York & 17800000 \\\\\n\t Berlin & 3382169 \\\\\n\t Madrid & 2938723 \\\\\n\t Stockholm & 1942362 \\\\\n\t NA & NA \\\\\n\t NA & NA \\\\\n\t NA & NA \\\\\n\t NA & NA \\\\\n\t NA & NA \\\\\n\\end{tabular}\n", "text/markdown": "\nCapital | Population | \n|---|---|---|---|---|---|---|---|---|\n| New York | 17800000 | \n| Berlin | 3382169 | \n| Madrid | 2938723 | \n| Stockholm | 1942362 | \n| NA | NA | \n| NA | NA | \n| NA | NA | \n| NA | NA | \n| NA | NA | \n\n\n", "text/plain": " Capital Population\n1 New York 17800000 \n2 Berlin 3382169 \n3 Madrid 2938723 \n4 Stockholm 1942362 \n5 NA NA \n6 NA NA \n7 NA NA \n8 NA NA \n9 NA NA " }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\n
Capital Population
New York 16044000
Berlin 3433695
Madrid 3010492
Stockholm 1683713
\n", "text/latex": "\\begin{tabular}{r|ll}\n Capital & Population\\\\\n\\hline\n\t New York & 16044000 \\\\\n\t Berlin & 3433695 \\\\\n\t Madrid & 3010492 \\\\\n\t Stockholm & 1683713 \\\\\n\\end{tabular}\n", "text/markdown": "\nCapital | Population | \n|---|---|---|---|\n| New York | 16044000 | \n| Berlin | 3433695 | \n| Madrid | 3010492 | \n| Stockholm | 1683713 | \n\n\n", "text/plain": " Capital Population\n1 New York 16044000 \n2 Berlin 3433695 \n3 Madrid 3010492 \n4 Stockholm 1683713 " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "### Using readxl package" }, { "metadata": { "hide_input": false, "run_control": { "frozen": false, "marked": false, "read_only": false }, "scrolled": false, "slideshow": { "slide_type": "fragment" }, "trusted": true }, "cell_type": "code", "source": "# Use the readxl package to read .xls and .xlsx files\n#install.packages(\"readxl\") \nlibrary(readxl)\nexcel_sheets(\"data/cities.xlsx\") \nmy_data<-read_excel(\"data/cities.xlsx\")\nmy_data\nread_excel(\"data/cities.xlsx\", sheet = 2) \nread_excel(\"data/cities.xlsx\", sheet = \"year_2000\")\n# col_names = FALSE: readxl assigns the column names itself \n# col_names = character vector: specify the column names manually \n", "execution_count": 1, "outputs": [ { "output_type": "display_data", "data": { "text/plain": "[1] \"year_1990\" \"year_2000\"", "text/latex": "\\begin{enumerate*}\n\\item 'year\\_1990'\n\\item 'year\\_2000'\n\\end{enumerate*}\n", "text/markdown": "1. 'year_1990'\n2. 'year_2000'\n\n\n", "text/html": "
    \n\t
  1. 'year_1990'
  2. 'year_2000'
\n" }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": " Capital Population\n1 New York 16044000 \n2 Berlin 3433695 \n3 Madrid 3010492 \n4 Stockholm 1683713 ", "text/latex": "\\begin{tabular}{r|ll}\n Capital & Population\\\\\n\\hline\n\t New York & 16044000 \\\\\n\t Berlin & 3433695 \\\\\n\t Madrid & 3010492 \\\\\n\t Stockholm & 1683713 \\\\\n\\end{tabular}\n", "text/markdown": "\n| Capital | Population |\n|---|---|\n| New York | 16044000 |\n| Berlin | 3433695 |\n| Madrid | 3010492 |\n| Stockholm | 1683713 |\n\n", "text/html": "\n\n\n\t\n\t\n\t\n\t\n\n
Capital Population
New York 16044000
Berlin 3433695
Madrid 3010492
Stockholm 1683713
\n" }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": " Capital Population\n1 New York 17800000 \n2 Berlin 3382169 \n3 Madrid 2938723 \n4 Stockholm 1942362 ", "text/latex": "\\begin{tabular}{r|ll}\n Capital & Population\\\\\n\\hline\n\t New York & 17800000 \\\\\n\t Berlin & 3382169 \\\\\n\t Madrid & 2938723 \\\\\n\t Stockholm & 1942362 \\\\\n\\end{tabular}\n", "text/markdown": "\n| Capital | Population |\n|---|---|\n| New York | 17800000 |\n| Berlin | 3382169 |\n| Madrid | 2938723 |\n| Stockholm | 1942362 |\n\n", "text/html": "\n\n\n\t\n\t\n\t\n\t\n\n
Capital Population
New York 17800000
Berlin 3382169
Madrid 2938723
Stockholm 1942362
\n" }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": " Capital Population\n1 New York 17800000 \n2 Berlin 3382169 \n3 Madrid 2938723 \n4 Stockholm 1942362 ", "text/latex": "\\begin{tabular}{r|ll}\n Capital & Population\\\\\n\\hline\n\t New York & 17800000 \\\\\n\t Berlin & 3382169 \\\\\n\t Madrid & 2938723 \\\\\n\t Stockholm & 1942362 \\\\\n\\end{tabular}\n", "text/markdown": "\n| Capital | Population |\n|---|---|\n| New York | 17800000 |\n| Berlin | 3382169 |\n| Madrid | 2938723 |\n| Stockholm | 1942362 |\n\n", "text/html": "\n\n\n\t\n\t\n\t\n\t\n\n
Capital Population
New York 17800000
Berlin 3382169
Madrid 2938723
Stockholm 1942362
\n" }, "metadata": {} } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "## Importing XML data into R " }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": true }, "cell_type": "code", "source": "# working with xml file from url link\nlibrary(RCurl)\nlibrary(XML, lib.loc=\"~/R/win-library/3.4\")\n# library(RCurl, lib.loc=\"~/R/win-library/3.4\")\nfileURL <- \"https://www.w3schools.com/xml/simple.xml\"\nxData <- getURL(fileURL)\ndoc <- xmlParse(xData)\nrootNode <- xmlRoot(doc)\nxmldataframe1 <- xmlToDataFrame(xData) \nxmldataframe1\n# working with local file\nresult <- xmlParse(\"data/books.xml\")\nprint(result)\nrootNode <- xmlRoot(result)\nrootSize <- xmlSize(rootNode)\nprint(rootSize)\nprint(rootNode[1])\nxmldataframe <- xmlToDataFrame(\"data/books.xml\")\nprint(xmldataframe)\nhead(xmldataframe)", "execution_count": 1, "outputs": [ { "output_type": "stream", "text": "Loading required package: bitops\n", "name": "stderr" }, { "output_type": "error", "ename": "ERROR", "evalue": "Error in library(XML, lib.loc = \"~/R/win-library/3.4\"): no library trees found in 'lib.loc'\n", "traceback": [ "Error in library(XML, lib.loc = \"~/R/win-library/3.4\"): no library trees found in 'lib.loc'\nTraceback:\n", "1. library(XML, lib.loc = \"~/R/win-library/3.4\")", "2. 
stop(txt, domain = NA)" ] } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "## Import JSON data into R" }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "#install.packages(\"rjson\", repos = \"http://cran.us.r-project.org\")\n# Load the package required to read JSON files.\nlibrary(\"rjson\", lib.loc=\"~/R/win-library/3.4\")\n\n# Give the input file name to the function.\nresult <- fromJSON(file = \"data/JSN.json\")\n\n# Print the result.\nprint(result)\nas.data.frame(result)", "execution_count": 23, "outputs": [ { "name": "stdout", "output_type": "stream", "text": "$ID\n[1] \"1\" \"2\" \"3\" \"4\" \"5\" \"6\" \"7\" \"8\"\n\n$Name\n[1] \"Rick\" \"Dan\" \"Michelle\" \"Ryan\" \"Gary\" \"Nina\" \"Simon\" \n[8] \"Guru\" \n\n$Salary\n[1] \"623.3\" \"515.2\" \"611\" \"729\" \"843.25\" \"578\" \"632.8\" \"722.5\" \n\n$StartDate\n[1] \"1/1/2012\" \"9/23/2013\" \"11/15/2014\" \"5/11/2014\" \"3/27/2015\" \n[6] \"5/21/2013\" \"7/30/2013\" \"6/17/2014\" \n\n$Dept\n[1] \"IT\" \"Operations\" \"IT\" \"HR\" \"Finance\" \n[6] \"IT\" \"Operations\" \"Finance\" \n\n" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\n
ID Name Salary StartDate Dept
1 Rick 623.3 1/1/2012 IT
2 Dan 515.2 9/23/2013 Operations
3 Michelle 611 11/15/2014 IT
4 Ryan 729 5/11/2014 HR
5 Gary 843.25 3/27/2015 Finance
6 Nina 578 5/21/2013 IT
7 Simon 632.8 7/30/2013 Operations
8 Guru 722.5 6/17/2014 Finance
\n", "text/latex": "\\begin{tabular}{r|lllll}\n ID & Name & Salary & StartDate & Dept\\\\\n\\hline\n\t 1 & Rick & 623.3 & 1/1/2012 & IT \\\\\n\t 2 & Dan & 515.2 & 9/23/2013 & Operations\\\\\n\t 3 & Michelle & 611 & 11/15/2014 & IT \\\\\n\t 4 & Ryan & 729 & 5/11/2014 & HR \\\\\n\t 5 & Gary & 843.25 & 3/27/2015 & Finance \\\\\n\t 6 & Nina & 578 & 5/21/2013 & IT \\\\\n\t 7 & Simon & 632.8 & 7/30/2013 & Operations\\\\\n\t 8 & Guru & 722.5 & 6/17/2014 & Finance \\\\\n\\end{tabular}\n", "text/markdown": "\nID | Name | Salary | StartDate | Dept | \n|---|---|---|---|---|---|---|---|\n| 1 | Rick | 623.3 | 1/1/2012 | IT | \n| 2 | Dan | 515.2 | 9/23/2013 | Operations | \n| 3 | Michelle | 611 | 11/15/2014 | IT | \n| 4 | Ryan | 729 | 5/11/2014 | HR | \n| 5 | Gary | 843.25 | 3/27/2015 | Finance | \n| 6 | Nina | 578 | 5/21/2013 | IT | \n| 7 | Simon | 632.8 | 7/30/2013 | Operations | \n| 8 | Guru | 722.5 | 6/17/2014 | Finance | \n\n\n", "text/plain": " ID Name Salary StartDate Dept \n1 1 Rick 623.3 1/1/2012 IT \n2 2 Dan 515.2 9/23/2013 Operations\n3 3 Michelle 611 11/15/2014 IT \n4 4 Ryan 729 5/11/2014 HR \n5 5 Gary 843.25 3/27/2015 Finance \n6 6 Nina 578 5/21/2013 IT \n7 7 Simon 632.8 7/30/2013 Operations\n8 8 Guru 722.5 6/17/2014 Finance " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "## Importing Data From HTML Tables Into R " }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "### Using httr packages" }, { "metadata": { "run_control": { "frozen": false, "marked": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "url <- \"https://en.wikipedia.org/wiki/Nandi_Award_for_Best_Actress\"\n# install.packages(\"httr\", repos = \"http://cran.us.r-project.org\")\nlibrary(httr)\nlibrary(XML) # for readHTMLTable 
\nurldata <- GET(url)\ndata <- readHTMLTable(rawToChar(urldata$content),stringsAsFactors = FALSE, as.data.frame = TRUE)\nhead(data)", "execution_count": 13, "outputs": [ { "data": { "text/html": "$`NULL` = \n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\n
V1 V2 V3 V4
2016 Ritu Varma Pelli Choopulu NA
2015 Anushka Shetty Size Zero NA
2014 Anjali Geethanjali NA
2013 Anjali Patil Naa Bangaaru Talli NA
2012 Samantha Ruth Prabhu Yeto Vellipoyindhi Manasu NA
2011 Nayantara Sri Rama Rajyam NA
2010 Nithya Menen[5] Ala Modalaindi NA
2009 Thirtha[6] Sontha Ooru NA
2008 Swathi Ashta Chamma NA
2007 Charmy Kaur Mantra NA
2006 Nandita Das Kamli NA
2005 Trisha Nuvvostanante Nenoddantana NA
2004 Kamalinee Mukherjee Anand NA
2003 Bhoomika Chawla Missamma NA
2002 Kalyani Avunu Valliddaru Ista Paddaru NA
2001 Laya[7] Preminchu NA
2000 Laya[8] Manoharam NA
1999 Maheswari Nee Kosam NA
1998 Ramya Krishnan Kante Koothurne Kanu NA
1998 Soundarya Antahpuram NA
1997 Vijayashanti Osey Ramulamma NA
1996 Soundarya Pavithra Bandham NA
1995 Aamani Subha Sankalpam NA
1994 Soundarya Ammoru NA
1993 Aamani Mr. Pellam NA
1992 Meena Rajeshwari Kalyanam NA
1991 Sridevi Kshana Kshanam NA
1990 VijayashantiMeena KarthavyamSeetharamaiah Gari Manavaralu
1989 Vijayashanti Bharatha Naari NA
1988 Bhanupriya Swarna Kamalam NA
1987 Sumalatha Sruthilayalu NA
1986 Lakshmi Sravana Megalu NA
1985 Vijayashanti Pratighatana NA
1984 Suhasini Swathi NA
1983 Jayasudha Dharmaatmudu NA
1982 Jayasudha Meghasandesam NA
1981 Jayasudha Premabhishekam NA
1980 Sakuntala Kukka NA
1979 Jayasudha Idi Katha Kaadu NA
1978 Roopa Naalaaga Endaro NA
1977 Lakshmi[1] Pantulamma NA
1976 Jayaprada[1] Anthuleni Kadha NA
1975 Jayasudha[1] Jyothi NA
\n", "text/latex": "\\textbf{\\$`NULL`} = \\begin{tabular}{r|llll}\n V1 & V2 & V3 & V4\\\\\n\\hline\n\t 2016 & Ritu Varma & Pelli Choopulu & NA \\\\\n\t 2015 & Anushka Shetty & Size Zero & NA \\\\\n\t 2014 & Anjali & Geethanjali & NA \\\\\n\t 2013 & Anjali Patil & Naa Bangaaru Talli & NA \\\\\n\t 2012 & Samantha Ruth Prabhu & Yeto Vellipoyindhi Manasu & NA \\\\\n\t 2011 & Nayantara & Sri Rama Rajyam & NA \\\\\n\t 2010 & Nithya Menen{[}5{]} & Ala Modalaindi & NA \\\\\n\t 2009 & Thirtha{[}6{]} & Sontha Ooru & NA \\\\\n\t 2008 & Swathi & Ashta Chamma & NA \\\\\n\t 2007 & Charmy Kaur & Mantra & NA \\\\\n\t 2006 & Nandita Das & Kamli & NA \\\\\n\t 2005 & Trisha & Nuvvostanante Nenoddantana & NA \\\\\n\t 2004 & Kamalinee Mukherjee & Anand & NA \\\\\n\t 2003 & Bhoomika Chawla & Missamma & NA \\\\\n\t 2002 & Kalyani & Avunu Valliddaru Ista Paddaru & NA \\\\\n\t 2001 & Laya{[}7{]} & Preminchu & NA \\\\\n\t 2000 & Laya{[}8{]} & Manoharam & NA \\\\\n\t 1999 & Maheswari & Nee Kosam & NA \\\\\n\t 1998 & Ramya Krishnan & Kante Koothurne Kanu & NA \\\\\n\t 1998 & Soundarya & Antahpuram & NA \\\\\n\t 1997 & Vijayashanti & Osey Ramulamma & NA \\\\\n\t 1996 & Soundarya & Pavithra Bandham & NA \\\\\n\t 1995 & Aamani & Subha Sankalpam & NA \\\\\n\t 1994 & Soundarya & Ammoru & NA \\\\\n\t 1993 & Aamani & Mr. 
Pellam & NA \\\\\n\t 1992 & Meena & Rajeshwari Kalyanam & NA \\\\\n\t 1991 & Sridevi & Kshana Kshanam & NA \\\\\n\t 1990 & VijayashantiMeena & KarthavyamSeetharamaiah Gari Manavaralu & \\\\\n\t 1989 & Vijayashanti & Bharatha Naari & NA \\\\\n\t 1988 & Bhanupriya & Swarna Kamalam & NA \\\\\n\t 1987 & Sumalatha & Sruthilayalu & NA \\\\\n\t 1986 & Lakshmi & Sravana Megalu & NA \\\\\n\t 1985 & Vijayashanti & Pratighatana & NA \\\\\n\t 1984 & Suhasini & Swathi & NA \\\\\n\t 1983 & Jayasudha & Dharmaatmudu & NA \\\\\n\t 1982 & Jayasudha & Meghasandesam & NA \\\\\n\t 1981 & Jayasudha & Premabhishekam & NA \\\\\n\t 1980 & Sakuntala & Kukka & NA \\\\\n\t 1979 & Jayasudha & Idi Katha Kaadu & NA \\\\\n\t 1978 & Roopa & Naalaaga Endaro & NA \\\\\n\t 1977 & Lakshmi{[}1{]} & Pantulamma & NA \\\\\n\t 1976 & Jayaprada{[}1{]} & Anthuleni Kadha & NA \\\\\n\t 1975 & Jayasudha{[}1{]} & Jyothi & NA \\\\\n\\end{tabular}\n", "text/markdown": "**$`NULL`** = \nV1 | V2 | V3 | V4 | \n|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n| 2016 | Ritu Varma | Pelli Choopulu | NA | \n| 2015 | Anushka Shetty | Size Zero | NA | \n| 2014 | Anjali | Geethanjali | NA | \n| 2013 | Anjali Patil | Naa Bangaaru Talli | NA | \n| 2012 | Samantha Ruth Prabhu | Yeto Vellipoyindhi Manasu | NA | \n| 2011 | Nayantara | Sri Rama Rajyam | NA | \n| 2010 | Nithya Menen[5] | Ala Modalaindi | NA | \n| 2009 | Thirtha[6] | Sontha Ooru | NA | \n| 2008 | Swathi | Ashta Chamma | NA | \n| 2007 | Charmy Kaur | Mantra | NA | \n| 2006 | Nandita Das | Kamli | NA | \n| 2005 | Trisha | Nuvvostanante Nenoddantana | NA | \n| 2004 | Kamalinee Mukherjee | Anand | NA | \n| 2003 | Bhoomika Chawla | Missamma | NA | \n| 2002 | Kalyani | Avunu Valliddaru Ista Paddaru | NA | \n| 2001 | Laya[7] | Preminchu | NA | \n| 2000 | Laya[8] | Manoharam | NA | \n| 1999 | Maheswari | Nee Kosam | NA | \n| 1998 | Ramya Krishnan | 
Kante Koothurne Kanu | NA | \n| 1998 | Soundarya | Antahpuram | NA | \n| 1997 | Vijayashanti | Osey Ramulamma | NA | \n| 1996 | Soundarya | Pavithra Bandham | NA | \n| 1995 | Aamani | Subha Sankalpam | NA | \n| 1994 | Soundarya | Ammoru | NA | \n| 1993 | Aamani | Mr. Pellam | NA | \n| 1992 | Meena | Rajeshwari Kalyanam | NA | \n| 1991 | Sridevi | Kshana Kshanam | NA | \n| 1990 | VijayashantiMeena | KarthavyamSeetharamaiah Gari Manavaralu | | \n| 1989 | Vijayashanti | Bharatha Naari | NA | \n| 1988 | Bhanupriya | Swarna Kamalam | NA | \n| 1987 | Sumalatha | Sruthilayalu | NA | \n| 1986 | Lakshmi | Sravana Megalu | NA | \n| 1985 | Vijayashanti | Pratighatana | NA | \n| 1984 | Suhasini | Swathi | NA | \n| 1983 | Jayasudha | Dharmaatmudu | NA | \n| 1982 | Jayasudha | Meghasandesam | NA | \n| 1981 | Jayasudha | Premabhishekam | NA | \n| 1980 | Sakuntala | Kukka | NA | \n| 1979 | Jayasudha | Idi Katha Kaadu | NA | \n| 1978 | Roopa | Naalaaga Endaro | NA | \n| 1977 | Lakshmi[1] | Pantulamma | NA | \n| 1976 | Jayaprada[1] | Anthuleni Kadha | NA | \n| 1975 | Jayasudha[1] | Jyothi | NA | \n\n\n", "text/plain": "$`NULL`\n V1 V2 V3 V4\n1 2016 Ritu Varma Pelli Choopulu \n2 2015 Anushka Shetty Size Zero \n3 2014 Anjali Geethanjali \n4 2013 Anjali Patil Naa Bangaaru Talli \n5 2012 Samantha Ruth Prabhu Yeto Vellipoyindhi Manasu \n6 2011 Nayantara Sri Rama Rajyam \n7 2010 Nithya Menen[5] Ala Modalaindi \n8 2009 Thirtha[6] Sontha Ooru \n9 2008 Swathi Ashta Chamma \n10 2007 Charmy Kaur Mantra \n11 2006 Nandita Das Kamli \n12 2005 Trisha Nuvvostanante Nenoddantana \n13 2004 Kamalinee Mukherjee Anand \n14 2003 Bhoomika Chawla Missamma \n15 2002 Kalyani Avunu Valliddaru Ista Paddaru \n16 2001 Laya[7] Preminchu \n17 2000 Laya[8] Manoharam \n18 1999 Maheswari Nee Kosam \n19 1998 Ramya Krishnan Kante Koothurne Kanu \n20 1998 Soundarya Antahpuram \n21 1997 Vijayashanti Osey Ramulamma \n22 1996 Soundarya Pavithra Bandham \n23 1995 Aamani Subha Sankalpam \n24 1994 Soundarya Ammoru \n25 1993 
Aamani Mr. Pellam \n26 1992 Meena Rajeshwari Kalyanam \n27 1991 Sridevi Kshana Kshanam \n28 1990 VijayashantiMeena KarthavyamSeetharamaiah Gari Manavaralu \n29 1989 Vijayashanti Bharatha Naari \n30 1988 Bhanupriya Swarna Kamalam \n31 1987 Sumalatha Sruthilayalu \n32 1986 Lakshmi Sravana Megalu \n33 1985 Vijayashanti Pratighatana \n34 1984 Suhasini Swathi \n35 1983 Jayasudha Dharmaatmudu \n36 1982 Jayasudha Meghasandesam \n37 1981 Jayasudha Premabhishekam \n38 1980 Sakuntala Kukka \n39 1979 Jayasudha Idi Katha Kaadu \n40 1978 Roopa Naalaaga Endaro \n41 1977 Lakshmi[1] Pantulamma \n42 1976 Jayaprada[1] Anthuleni Kadha \n43 1975 Jayasudha[1] Jyothi \n" }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "run_control": { "frozen": false, "marked": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "dataDF<-as.data.frame(data)\nhead(dataDF)", "execution_count": null, "outputs": [] }, { "metadata": { "run_control": { "frozen": false, "marked": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "dataDF$NULL.V4 <- NULL\nnames(dataDF) <- c(\"Year\", \"Actress\", \"Film\")\ndim(dataDF)\nprint(dataDF)", "execution_count": 11, "outputs": [ { "data": { "text/html": "
    \n\t
  1. 43
  2. 3
\n", "text/latex": "\\begin{enumerate*}\n\\item 43\n\\item 3\n\\end{enumerate*}\n", "text/markdown": "1. 43\n2. 3\n\n\n", "text/plain": "[1] 43 3" }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": " Year Actress Film\n1 2016 Ritu Varma Pelli Choopulu\n2 2015 Anushka Shetty Size Zero\n3 2014 Anjali Geethanjali\n4 2013 Anjali Patil Naa Bangaaru Talli\n5 2012 Samantha Ruth Prabhu Yeto Vellipoyindhi Manasu\n6 2011 Nayantara Sri Rama Rajyam\n7 2010 Nithya Menen[5] Ala Modalaindi\n8 2009 Thirtha[6] Sontha Ooru\n9 2008 Swathi Ashta Chamma\n10 2007 Charmy Kaur Mantra\n11 2006 Nandita Das Kamli\n12 2005 Trisha Nuvvostanante Nenoddantana\n13 2004 Kamalinee Mukherjee Anand\n14 2003 Bhoomika Chawla Missamma\n15 2002 Kalyani Avunu Valliddaru Ista Paddaru\n16 2001 Laya[7] Preminchu\n17 2000 Laya[8] Manoharam\n18 1999 Maheswari Nee Kosam\n19 1998 Ramya Krishnan Kante Koothurne Kanu\n20 1998 Soundarya Antahpuram\n21 1997 Vijayashanti Osey Ramulamma\n22 1996 Soundarya Pavithra Bandham\n23 1995 Aamani Subha Sankalpam\n24 1994 Soundarya Ammoru\n25 1993 Aamani Mr. 
Pellam\n26 1992 Meena Rajeshwari Kalyanam\n27 1991 Sridevi Kshana Kshanam\n28 1990 VijayashantiMeena KarthavyamSeetharamaiah Gari Manavaralu\n29 1989 Vijayashanti Bharatha Naari\n30 1988 Bhanupriya Swarna Kamalam\n31 1987 Sumalatha Sruthilayalu\n32 1986 Lakshmi Sravana Megalu\n33 1985 Vijayashanti Pratighatana\n34 1984 Suhasini Swathi\n35 1983 Jayasudha Dharmaatmudu\n36 1982 Jayasudha Meghasandesam\n37 1981 Jayasudha Premabhishekam\n38 1980 Sakuntala Kukka\n39 1979 Jayasudha Idi Katha Kaadu\n40 1978 Roopa Naalaaga Endaro\n41 1977 Lakshmi[1] Pantulamma\n42 1976 Jayaprada[1] Anthuleni Kadha\n43 1975 Jayasudha[1] Jyothi\n" } ] }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "skip" }, "trusted": false }, "cell_type": "code", "source": "# library (plyr)\n# df <- ldply (data, data.frame)\n# dim(df)\n# head(df)", "execution_count": 12, "outputs": [] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "**Further reference:** https://www.apress.com/in/book/9781461478997 " }, { "metadata": {}, "cell_type": "markdown", "source": "### Using rvest package (web scraping)\n\n**rvest in action** " }, { "metadata": { "collapsed": true, "run_control": { "frozen": false, "marked": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "library(rvest)\n# install.packages(\"xml2\", repos = \"http://cran.us.r-project.org\")\nlibrary(xml2,lib.loc=\"~/R/win-library/3.4\")\nlego_movie <- read_html(\"http://www.imdb.com/title/tt1490017/\")\n#print(lego_movie)", "execution_count": 26, "outputs": [] }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "lego_movie %>%\nhtml_node(\"strong span\") %>%\nhtml_text() %>%\nas.numeric()\n# selector:\nlego_movie %>%\nhtml_nodes(\"#titleCast .itemprop 
span\") %>%\nhtml_text()\n# The titles and authors of recent message board postings are stored in a the third table on the", "execution_count": 27, "outputs": [ { "name": "stderr", "output_type": "stream", "text": "Your code contains a unicode char which cannot be displayed in your\ncurrent locale and R will silently convert it to an escaped form when the\nR kernel executes this code. This can lead to subtle errors if you use\nsuch chars to do comparisons. For more information, please see\nhttps://github.com/IRkernel/repr/wiki/Problems-with-unicode-on-windows" }, { "data": { "text/html": "7.8", "text/latex": "7.8", "text/markdown": "7.8", "text/plain": "[1] 7.8" }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "
    \n\t
  1. 'Will Arnett'
  2. 'Elizabeth Banks'
  3. 'Craig Berry'
  4. 'Alison Brie'
  5. 'David Burrows'
  6. 'Anthony Daniels'
  7. 'Charlie Day'
  8. 'Amanda Farinos'
  9. 'Keith Ferguson'
  10. 'Will Ferrell'
  11. 'Will Forte'
  12. 'Dave Franco'
  13. 'Morgan Freeman'
  14. 'Todd Hansen'
  15. 'Jonah Hill'
\n", "text/latex": "\\begin{enumerate*}\n\\item 'Will Arnett'\n\\item 'Elizabeth Banks'\n\\item 'Craig Berry'\n\\item 'Alison Brie'\n\\item 'David Burrows'\n\\item 'Anthony Daniels'\n\\item 'Charlie Day'\n\\item 'Amanda Farinos'\n\\item 'Keith Ferguson'\n\\item 'Will Ferrell'\n\\item 'Will Forte'\n\\item 'Dave Franco'\n\\item 'Morgan Freeman'\n\\item 'Todd Hansen'\n\\item 'Jonah Hill'\n\\end{enumerate*}\n", "text/markdown": "1. 'Will Arnett'\n2. 'Elizabeth Banks'\n3. 'Craig Berry'\n4. 'Alison Brie'\n5. 'David Burrows'\n6. 'Anthony Daniels'\n7. 'Charlie Day'\n8. 'Amanda Farinos'\n9. 'Keith Ferguson'\n10. 'Will Ferrell'\n11. 'Will Forte'\n12. 'Dave Franco'\n13. 'Morgan Freeman'\n14. 'Todd Hansen'\n15. 'Jonah Hill'\n\n\n", "text/plain": " [1] \"Will Arnett\" \"Elizabeth Banks\" \"Craig Berry\" \"Alison Brie\" \n [5] \"David Burrows\" \"Anthony Daniels\" \"Charlie Day\" \"Amanda Farinos\" \n [9] \"Keith Ferguson\" \"Will Ferrell\" \"Will Forte\" \"Dave Franco\" \n[13] \"Morgan Freeman\" \"Todd Hansen\" \"Jonah Hill\" " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "fragment" } }, "cell_type": "markdown", "source": "#### Import data from wikipedia (Scrape an HTML Table)\n* html_nodes: Select parts of an html document using css selectors\n* html_table: Parse tables into data frames with html_table().\n* read_html: Read in the content from a .html file (For xml2 and rvest packages.)\n* `browseVignettes()` to check the any sample codes of the concepts" }, { "metadata": { "collapsed": true, "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "obesity = read_html(\"https://en.wikipedia.org/wiki/Obesity_in_the_United_States\")\n#Using rvest to Scrape an HTML Table by taking table number\nobesity = obesity %>%\n html_nodes(\"table\") %>%\n .[[2]]%>%\n html_table(fill=T)", "execution_count": 28, "outputs": [] }, { "metadata": { 
"run_control": { "frozen": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "dim(obesity)\nhead(obesity)\ntail(obesity)", "execution_count": 29, "outputs": [ { "data": { "text/html": "
    \n\t
  1. 56
  2. 6
\n", "text/latex": "\\begin{enumerate*}\n\\item 56\n\\item 6\n\\end{enumerate*}\n", "text/markdown": "1. 56\n2. 6\n\n\n", "text/plain": "[1] 56 6" }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
States, District,\n& Territories | Obese adults (mid-2000s) | Obese adults (2016)[52][57] | Overweight (incl. obese) adults\n(mid-2000s) | Obese children and adolescents\n(mid-2000s)[58] | Obesity rank
Alabama 30.1% 35.7% 65.4% 16.7% 3
Alaska 27.3% 31.4% 64.5% 11.1% 14
American Samoa 75%[56] 95%[59] 35%[56][60]
Arizona 23.3% 29.0% 59.5% 12.2% 40
Arkansas 28.1% 35.7% 64.7% 16.4% 9
California 23.1% 25.0% 59.4% 13.2% 41
\n", "text/latex": "\\begin{tabular}{r|llllll}\n States, District,\n\\& Territories & Obese adults (mid-2000s) & Obese adults (2016){[}52{]}{[}57{]} & Overweight (incl. obese) adults\n(mid-2000s) & Obese children and adolescents\n(mid-2000s){[}58{]} & Obesity rank\\\\\n\\hline\n\t Alabama & 30.1\\% & 35.7\\% & 65.4\\% & 16.7\\% & 3 \\\\\n\t Alaska & 27.3\\% & 31.4\\% & 64.5\\% & 11.1\\% & 14 \\\\\n\t American Samoa & — & 75\\%{[}56{]} & 95\\%{[}59{]} & 35\\%{[}56{]}{[}60{]} & — \\\\\n\t Arizona & 23.3\\% & 29.0\\% & 59.5\\% & 12.2\\% & 40 \\\\\n\t Arkansas & 28.1\\% & 35.7\\% & 64.7\\% & 16.4\\% & 9 \\\\\n\t California & 23.1\\% & 25.0\\% & 59.4\\% & 13.2\\% & 41 \\\\\n\\end{tabular}\n", "text/markdown": "\nStates, District,\n& Territories | Obese adults (mid-2000s) | Obese adults (2016)[52][57] | Overweight (incl. obese) adults\n(mid-2000s) | Obese children and adolescents\n(mid-2000s)[58] | Obesity rank | \n|---|---|---|---|---|---|\n| Alabama | 30.1% | 35.7% | 65.4% | 16.7% | 3 | \n| Alaska | 27.3% | 31.4% | 64.5% | 11.1% | 14 | \n| American Samoa | — | 75%[56] | 95%[59] | 35%[56][60] | — | \n| Arizona | 23.3% | 29.0% | 59.5% | 12.2% | 40 | \n| Arkansas | 28.1% | 35.7% | 64.7% | 16.4% | 9 | \n| California | 23.1% | 25.0% | 59.4% | 13.2% | 41 | \n\n\n", "text/plain": " States, District,\\n& Territories Obese adults (mid-2000s)\n1 Alabama 30.1% \n2 Alaska 27.3% \n3 American Samoa — \n4 Arizona 23.3% \n5 Arkansas 28.1% \n6 California 23.1% \n Obese adults (2016)[52][57] Overweight (incl. obese) adults\\n(mid-2000s)\n1 35.7% 65.4% \n2 31.4% 64.5% \n3 75%[56] 95%[59] \n4 29.0% 59.5% \n5 35.7% 64.7% \n6 25.0% 59.4% \n Obese children and adolescents\\n(mid-2000s)[58] Obesity rank\n1 16.7% 3 \n2 11.1% 14 \n3 35%[56][60] — \n4 12.2% 40 \n5 16.4% 9 \n6 13.2% 41 " }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
States, District,\n& Territories | Obese adults (mid-2000s) | Obese adults (2016)[52][57] | Overweight (incl. obese) adults\n(mid-2000s) | Obese children and adolescents\n(mid-2000s)[58] | Obesity rank
51 Virgin Islands (U.S.) 32.5%
52 Virginia 25.2% 29.0% 61.6% 13.8% 27
53 Washington 24.5% 28.6% 60.7% 10.8% 32
54 West Virginia 30.6% 37.7% 66.8% 20.9% 2
55 Wisconsin 25.5% 30.7% 62.4% 13.5% 25
56 Wyoming 24.0% 27.7% 61.7% 8.7% 33
\n", "text/latex": "\\begin{tabular}{r|llllll}\n & States, District,\n\\& Territories & Obese adults (mid-2000s) & Obese adults (2016){[}52{]}{[}57{]} & Overweight (incl. obese) adults\n(mid-2000s) & Obese children and adolescents\n(mid-2000s){[}58{]} & Obesity rank\\\\\n\\hline\n\t51 & Virgin Islands (U.S.) & — & 32.5\\% & — & — & — \\\\\n\t52 & Virginia & 25.2\\% & 29.0\\% & 61.6\\% & 13.8\\% & 27 \\\\\n\t53 & Washington & 24.5\\% & 28.6\\% & 60.7\\% & 10.8\\% & 32 \\\\\n\t54 & West Virginia & 30.6\\% & 37.7\\% & 66.8\\% & 20.9\\% & 2 \\\\\n\t55 & Wisconsin & 25.5\\% & 30.7\\% & 62.4\\% & 13.5\\% & 25 \\\\\n\t56 & Wyoming & 24.0\\% & 27.7\\% & 61.7\\% & 8.7\\% & 33 \\\\\n\\end{tabular}\n", "text/markdown": "\n| | States, District,\n& Territories | Obese adults (mid-2000s) | Obese adults (2016)[52][57] | Overweight (incl. obese) adults\n(mid-2000s) | Obese children and adolescents\n(mid-2000s)[58] | Obesity rank | \n|---|---|---|---|---|---|\n| 51 | Virgin Islands (U.S.) | — | 32.5% | — | — | — | \n| 52 | Virginia | 25.2% | 29.0% | 61.6% | 13.8% | 27 | \n| 53 | Washington | 24.5% | 28.6% | 60.7% | 10.8% | 32 | \n| 54 | West Virginia | 30.6% | 37.7% | 66.8% | 20.9% | 2 | \n| 55 | Wisconsin | 25.5% | 30.7% | 62.4% | 13.5% | 25 | \n| 56 | Wyoming | 24.0% | 27.7% | 61.7% | 8.7% | 33 | \n\n\n", "text/plain": " States, District,\\n& Territories Obese adults (mid-2000s)\n51 Virgin Islands (U.S.) — \n52 Virginia 25.2% \n53 Washington 24.5% \n54 West Virginia 30.6% \n55 Wisconsin 25.5% \n56 Wyoming 24.0% \n Obese adults (2016)[52][57] Overweight (incl. 
obese) adults\\n(mid-2000s)\n51 32.5% — \n52 29.0% 61.6% \n53 28.6% 60.7% \n54 37.7% 66.8% \n55 30.7% 62.4% \n56 27.7% 61.7% \n Obese children and adolescents\\n(mid-2000s)[58] Obesity rank\n51 — — \n52 13.8% 27 \n53 10.8% 32 \n54 20.9% 2 \n55 13.5% 25 \n56 8.7% 33 " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "#### Import data from Wikipedia (using XPath)" }, { "metadata": { "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "**Steps** \n1. *Browse to the desired page and locate the table*\n2. *Right-click the table you want and choose “Inspect element”*\n3. *Choose “Copy XPath”*\n4. *Paste that XPath into the appropriate spot as shown in the code below*" }, { "metadata": { "slideshow": { "slide_type": "slide" }, "trusted": false }, "cell_type": "code", "source": "library(magrittr) # for the pipe operator\nlibrary(xml2) # for read_html() function\nlibrary(rvest) # for html_nodes() and html_table()", "execution_count": 3, "outputs": [] }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": false, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "url <- \"https://en.wikipedia.org/wiki/List_of_districts_in_Telangana\"\ndistricts <- url %>%\n read_html() %>%\n html_nodes(xpath='//*[@id=\"mw-content-text\"]/div/table') %>%\n html_table()\ndistricts <- districts[[1]] # html_table() returns a list of tables; keep the first\n\nhead(districts)\ntail(districts)\n# //*[@id=\"mw-content-text\"]/div/table[1]/tbody/tr[7]/td[3]", "execution_count": 4, "outputs": [ { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
S.No.NameHeadquartersArea (km2)Population\n(2011 census)No.of mandalsDensity\n(per km2)Urban (%)Literacy (%)Sex Ratio
1 Adilabad Adilabad 4,153 708,972 18 171 23.66 63.46 989
2 Bhadradri Kothagudem Kothagudem 7,483 1,069,261 23 143 31.71 66.40 1008
3 Hyderabad Hyderabad 217 3,943,323 16 18172 100.00 83.25 954
4 Jagtial Jagtial 2,419 985,417 18 407 22.46 60.26 1036
5 Jangaon Jangaon 2,188 566,376 13 259 12.60 61.44 997
6 Jayashankar BhupalapallyBhupalpalle 6,175 711,434 20 115 7.57 60.33 1009
\n", "text/latex": "\\begin{tabular}{r|llllllllll}\n S.No. & Name & Headquarters & Area (km2) & Population\n(2011 census) & No.of mandals & Density\n(per km2) & Urban (\\%) & Literacy (\\%) & Sex Ratio\\\\\n\\hline\n\t 1 & Adilabad & Adilabad & 4,153 & 708,972 & 18 & 171 & 23.66 & 63.46 & 989 \\\\\n\t 2 & Bhadradri Kothagudem & Kothagudem & 7,483 & 1,069,261 & 23 & 143 & 31.71 & 66.40 & 1008 \\\\\n\t 3 & Hyderabad & Hyderabad & 217 & 3,943,323 & 16 & 18172 & 100.00 & 83.25 & 954 \\\\\n\t 4 & Jagtial & Jagtial & 2,419 & 985,417 & 18 & 407 & 22.46 & 60.26 & 1036 \\\\\n\t 5 & Jangaon & Jangaon & 2,188 & 566,376 & 13 & 259 & 12.60 & 61.44 & 997 \\\\\n\t 6 & Jayashankar Bhupalapally & Bhupalpalle & 6,175 & 711,434 & 20 & 115 & 7.57 & 60.33 & 1009 \\\\\n\\end{tabular}\n", "text/markdown": "\nS.No. | Name | Headquarters | Area (km2) | Population\n(2011 census) | No.of mandals | Density\n(per km2) | Urban (%) | Literacy (%) | Sex Ratio | \n|---|---|---|---|---|---|\n| 1 | Adilabad | Adilabad | 4,153 | 708,972 | 18 | 171 | 23.66 | 63.46 | 989 | \n| 2 | Bhadradri Kothagudem | Kothagudem | 7,483 | 1,069,261 | 23 | 143 | 31.71 | 66.40 | 1008 | \n| 3 | Hyderabad | Hyderabad | 217 | 3,943,323 | 16 | 18172 | 100.00 | 83.25 | 954 | \n| 4 | Jagtial | Jagtial | 2,419 | 985,417 | 18 | 407 | 22.46 | 60.26 | 1036 | \n| 5 | Jangaon | Jangaon | 2,188 | 566,376 | 13 | 259 | 12.60 | 61.44 | 997 | \n| 6 | Jayashankar Bhupalapally | Bhupalpalle | 6,175 | 711,434 | 20 | 115 | 7.57 | 60.33 | 1009 | \n\n\n", "text/plain": " S.No. 
Name Headquarters Area (km2)\n1 1 Adilabad Adilabad 4,153 \n2 2 Bhadradri Kothagudem Kothagudem 7,483 \n3 3 Hyderabad Hyderabad 217 \n4 4 Jagtial Jagtial 2,419 \n5 5 Jangaon Jangaon 2,188 \n6 6 Jayashankar Bhupalapally Bhupalpalle 6,175 \n Population\\n(2011 census) No.of mandals Density\\n(per km2) Urban (%)\n1 708,972 18 171 23.66 \n2 1,069,261 23 143 31.71 \n3 3,943,323 16 18172 100.00 \n4 985,417 18 407 22.46 \n5 566,376 13 259 12.60 \n6 711,434 20 115 7.57 \n Literacy (%) Sex Ratio\n1 63.46 989 \n2 66.40 1008 \n3 83.25 954 \n4 60.26 1036 \n5 61.44 997 \n6 60.33 1009 " }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
S.No.NameHeadquartersArea (km2)Population\n(2011 census)No.of mandalsDensity\n(per km2)Urban (%)Literacy (%)Sex Ratio
2727 Vikarabad[9] Vikarabad 3,386 927,140 18 274 13.48 57.91 1001
2828 Wanaparthy Wanaparthy 2,152 577,758 14 268 15.97 55.67 960
2929 Warangal Rural Warangal 2,175 718,537 15 330 6.99 61.26 994
3030 Warangal Urban Warangal 1,309 1,080,858 11 826 68.51 76.17 997
3131 Yadadri BhuvanagiriBhongir 3,092 739,448 16 239 16.66 65.53 973
32Telangana - - 112,077 35,003,674 584 312 38.88 66.54 988
\n", "text/latex": "\\begin{tabular}{r|llllllllll}\n & S.No. & Name & Headquarters & Area (km2) & Population\n(2011 census) & No.of mandals & Density\n(per km2) & Urban (\\%) & Literacy (\\%) & Sex Ratio\\\\\n\\hline\n\t27 & 27 & Vikarabad{[}9{]} & Vikarabad & 3,386 & 927,140 & 18 & 274 & 13.48 & 57.91 & 1001 \\\\\n\t28 & 28 & Wanaparthy & Wanaparthy & 2,152 & 577,758 & 14 & 268 & 15.97 & 55.67 & 960 \\\\\n\t29 & 29 & Warangal Rural & Warangal & 2,175 & 718,537 & 15 & 330 & 6.99 & 61.26 & 994 \\\\\n\t30 & 30 & Warangal Urban & Warangal & 1,309 & 1,080,858 & 11 & 826 & 68.51 & 76.17 & 997 \\\\\n\t31 & 31 & Yadadri Bhuvanagiri & Bhongir & 3,092 & 739,448 & 16 & 239 & 16.66 & 65.53 & 973 \\\\\n\t32 & Telangana & - & - & 112,077 & 35,003,674 & 584 & 312 & 38.88 & 66.54 & 988 \\\\\n\\end{tabular}\n", "text/markdown": "\n| | S.No. | Name | Headquarters | Area (km2) | Population\n(2011 census) | No.of mandals | Density\n(per km2) | Urban (%) | Literacy (%) | Sex Ratio | \n|---|---|---|---|---|---|\n| 27 | 27 | Vikarabad[9] | Vikarabad | 3,386 | 927,140 | 18 | 274 | 13.48 | 57.91 | 1001 | \n| 28 | 28 | Wanaparthy | Wanaparthy | 2,152 | 577,758 | 14 | 268 | 15.97 | 55.67 | 960 | \n| 29 | 29 | Warangal Rural | Warangal | 2,175 | 718,537 | 15 | 330 | 6.99 | 61.26 | 994 | \n| 30 | 30 | Warangal Urban | Warangal | 1,309 | 1,080,858 | 11 | 826 | 68.51 | 76.17 | 997 | \n| 31 | 31 | Yadadri Bhuvanagiri | Bhongir | 3,092 | 739,448 | 16 | 239 | 16.66 | 65.53 | 973 | \n| 32 | Telangana | - | - | 112,077 | 35,003,674 | 584 | 312 | 38.88 | 66.54 | 988 | \n\n\n", "text/plain": " S.No. 
Name Headquarters Area (km2)\n27 27 Vikarabad[9] Vikarabad 3,386 \n28 28 Wanaparthy Wanaparthy 2,152 \n29 29 Warangal Rural Warangal 2,175 \n30 30 Warangal Urban Warangal 1,309 \n31 31 Yadadri Bhuvanagiri Bhongir 3,092 \n32 Telangana - - 112,077 \n Population\\n(2011 census) No.of mandals Density\\n(per km2) Urban (%)\n27 927,140 18 274 13.48 \n28 577,758 14 268 15.97 \n29 718,537 15 330 6.99 \n30 1,080,858 11 826 68.51 \n31 739,448 16 239 16.66 \n32 35,003,674 584 312 38.88 \n Literacy (%) Sex Ratio\n27 57.91 1001 \n28 55.67 960 \n29 61.26 994 \n30 76.17 997 \n31 65.53 973 \n32 66.54 988 " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "**Further References:** https://stat4701.github.io/edav/2015/04/02/rvest_tutorial/" }, { "metadata": { "hide_input": false, "run_control": { "marked": false }, "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "## Importing Data From Other Statistical Software\n* Packages\n  * haven\n  * foreign\n  * Hmisc\n* From SAS\n* From Stata\n* From SPSS" }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "![Imgur](https://i.imgur.com/hRvafeg.png)" }, { "metadata": { "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "### haven package" }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "#### From SAS" }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "fragment" }, "trusted": false }, "cell_type": "code", "source": "# install.packages(\"haven\", repos = \"http://cran.us.r-project.org\")\nlibrary(haven)\nsales <- read_sas(\"data/sales.sas7bdat\")\nhead(sales)", "execution_count": 31, "outputs": [ { "name": "stderr", "output_type": "stream", "text": "Warning message:\n\"package 'haven' was built under R version 3.4.3\"" }, { "data": { 
"text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
purchaseagegenderincome
0 41 FemaleLow
0 47 FemaleLow
1 41 FemaleLow
1 39 FemaleLow
0 32 FemaleLow
0 32 FemaleLow
\n", "text/latex": "\\begin{tabular}{r|llll}\n purchase & age & gender & income\\\\\n\\hline\n\t 0 & 41 & Female & Low \\\\\n\t 0 & 47 & Female & Low \\\\\n\t 1 & 41 & Female & Low \\\\\n\t 1 & 39 & Female & Low \\\\\n\t 0 & 32 & Female & Low \\\\\n\t 0 & 32 & Female & Low \\\\\n\\end{tabular}\n", "text/markdown": "\npurchase | age | gender | income | \n|---|---|---|---|---|---|\n| 0 | 41 | Female | Low | \n| 0 | 47 | Female | Low | \n| 1 | 41 | Female | Low | \n| 1 | 39 | Female | Low | \n| 0 | 32 | Female | Low | \n| 0 | 32 | Female | Low | \n\n\n", "text/plain": " purchase age gender income\n1 0 41 Female Low \n2 0 47 Female Low \n3 1 41 Female Low \n4 1 39 Female Low \n5 0 32 Female Low \n6 0 32 Female Low " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "#### From Stata" }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "fragment" }, "trusted": false }, "cell_type": "code", "source": "# read_stata() is an alias for read_dta(); both calls return the same data\ntrade <- read_stata(\"data/trade.dta\")\nhead(trade)\ntrade <- read_dta(\"data/trade.dta\")\nhead(trade)", "execution_count": 32, "outputs": [ { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
DateImportWeight_IExportWeight_E
10 37664782 54029106 54505513 93350013
9 16316512 21584365 102700010158000010
8 11082246 14526089 37935000 88000000
7 35677943 55034932 48515008112000005
6 9879878 14806865 71486545131800000
5 1539992 1749318 12311696 18500014
\n", "text/latex": "\\begin{tabular}{r|lllll}\n Date & Import & Weight\\_I & Export & Weight\\_E\\\\\n\\hline\n\t 10 & 37664782 & 54029106 & 54505513 & 93350013\\\\\n\t 9 & 16316512 & 21584365 & 102700010 & 158000010\\\\\n\t 8 & 11082246 & 14526089 & 37935000 & 88000000\\\\\n\t 7 & 35677943 & 55034932 & 48515008 & 112000005\\\\\n\t 6 & 9879878 & 14806865 & 71486545 & 131800000\\\\\n\t 5 & 1539992 & 1749318 & 12311696 & 18500014\\\\\n\\end{tabular}\n", "text/markdown": "\nDate | Import | Weight_I | Export | Weight_E | \n|---|---|---|---|---|---|\n| 10 | 37664782 | 54029106 | 54505513 | 93350013 | \n| 9 | 16316512 | 21584365 | 102700010 | 158000010 | \n| 8 | 11082246 | 14526089 | 37935000 | 88000000 | \n| 7 | 35677943 | 55034932 | 48515008 | 112000005 | \n| 6 | 9879878 | 14806865 | 71486545 | 131800000 | \n| 5 | 1539992 | 1749318 | 12311696 | 18500014 | \n\n\n", "text/plain": " Date Import Weight_I Export Weight_E \n1 10 37664782 54029106 54505513 93350013\n2 9 16316512 21584365 102700010 158000010\n3 8 11082246 14526089 37935000 88000000\n4 7 35677943 55034932 48515008 112000005\n5 6 9879878 14806865 71486545 131800000\n6 5 1539992 1749318 12311696 18500014" }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
DateImportWeight_IExportWeight_E
10 37664782 54029106 54505513 93350013
9 16316512 21584365 102700010158000010
8 11082246 14526089 37935000 88000000
7 35677943 55034932 48515008112000005
6 9879878 14806865 71486545131800000
5 1539992 1749318 12311696 18500014
\n", "text/latex": "\\begin{tabular}{r|lllll}\n Date & Import & Weight\\_I & Export & Weight\\_E\\\\\n\\hline\n\t 10 & 37664782 & 54029106 & 54505513 & 93350013\\\\\n\t 9 & 16316512 & 21584365 & 102700010 & 158000010\\\\\n\t 8 & 11082246 & 14526089 & 37935000 & 88000000\\\\\n\t 7 & 35677943 & 55034932 & 48515008 & 112000005\\\\\n\t 6 & 9879878 & 14806865 & 71486545 & 131800000\\\\\n\t 5 & 1539992 & 1749318 & 12311696 & 18500014\\\\\n\\end{tabular}\n", "text/markdown": "\nDate | Import | Weight_I | Export | Weight_E | \n|---|---|---|---|---|---|\n| 10 | 37664782 | 54029106 | 54505513 | 93350013 | \n| 9 | 16316512 | 21584365 | 102700010 | 158000010 | \n| 8 | 11082246 | 14526089 | 37935000 | 88000000 | \n| 7 | 35677943 | 55034932 | 48515008 | 112000005 | \n| 6 | 9879878 | 14806865 | 71486545 | 131800000 | \n| 5 | 1539992 | 1749318 | 12311696 | 18500014 | \n\n\n", "text/plain": " Date Import Weight_I Export Weight_E \n1 10 37664782 54029106 54505513 93350013\n2 9 16316512 21584365 102700010 158000010\n3 8 11082246 14526089 37935000 88000000\n4 7 35677943 55034932 48515008 112000005\n5 6 9879878 14806865 71486545 131800000\n6 5 1539992 1749318 12311696 18500014" }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "#### From SPSS" }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "fragment" }, "trusted": false }, "cell_type": "code", "source": "Anxiety <- read_spss(\"data/Anxiety 2.sav\")\nAnxiety", "execution_count": 33, "outputs": [ { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\n
subjectanxietytensiontrial1trial2trial3trial4
11 1 1814126
21 1 1912 84
31 1 1410 62
41 2 1612104
51 2 12 8 62
61 2 1810 51
72 1 1610 84
82 1 18 8 41
92 1 1612 62
102 2 1916108
112 2 1614109
122 2 1612 88
\n", "text/latex": "\\begin{tabular}{r|lllllll}\n subject & anxiety & tension & trial1 & trial2 & trial3 & trial4\\\\\n\\hline\n\t 1 & 1 & 1 & 18 & 14 & 12 & 6 \\\\\n\t 2 & 1 & 1 & 19 & 12 & 8 & 4 \\\\\n\t 3 & 1 & 1 & 14 & 10 & 6 & 2 \\\\\n\t 4 & 1 & 2 & 16 & 12 & 10 & 4 \\\\\n\t 5 & 1 & 2 & 12 & 8 & 6 & 2 \\\\\n\t 6 & 1 & 2 & 18 & 10 & 5 & 1 \\\\\n\t 7 & 2 & 1 & 16 & 10 & 8 & 4 \\\\\n\t 8 & 2 & 1 & 18 & 8 & 4 & 1 \\\\\n\t 9 & 2 & 1 & 16 & 12 & 6 & 2 \\\\\n\t 10 & 2 & 2 & 19 & 16 & 10 & 8 \\\\\n\t 11 & 2 & 2 & 16 & 14 & 10 & 9 \\\\\n\t 12 & 2 & 2 & 16 & 12 & 8 & 8 \\\\\n\\end{tabular}\n", "text/markdown": "\nsubject | anxiety | tension | trial1 | trial2 | trial3 | trial4 | \n|---|---|---|---|---|---|---|---|---|---|---|---|\n| 1 | 1 | 1 | 18 | 14 | 12 | 6 | \n| 2 | 1 | 1 | 19 | 12 | 8 | 4 | \n| 3 | 1 | 1 | 14 | 10 | 6 | 2 | \n| 4 | 1 | 2 | 16 | 12 | 10 | 4 | \n| 5 | 1 | 2 | 12 | 8 | 6 | 2 | \n| 6 | 1 | 2 | 18 | 10 | 5 | 1 | \n| 7 | 2 | 1 | 16 | 10 | 8 | 4 | \n| 8 | 2 | 1 | 18 | 8 | 4 | 1 | \n| 9 | 2 | 1 | 16 | 12 | 6 | 2 | \n| 10 | 2 | 2 | 19 | 16 | 10 | 8 | \n| 11 | 2 | 2 | 16 | 14 | 10 | 9 | \n| 12 | 2 | 2 | 16 | 12 | 8 | 8 | \n\n\n", "text/plain": " subject anxiety tension trial1 trial2 trial3 trial4\n1 1 1 1 18 14 12 6 \n2 2 1 1 19 12 8 4 \n3 3 1 1 14 10 6 2 \n4 4 1 2 16 12 10 4 \n5 5 1 2 12 8 6 2 \n6 6 1 2 18 10 5 1 \n7 7 2 1 16 10 8 4 \n8 8 2 1 18 8 4 1 \n9 9 2 1 16 12 6 2 \n10 10 2 2 19 16 10 8 \n11 11 2 2 16 14 10 9 \n12 12 2 2 16 12 8 8 " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "### foreign package" }, { "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "library(foreign)\n# from SAS (this call comes from the separate sas7bdat package, not foreign)\n# install.packages(\"sas7bdat\", repos = \"http://cran.us.r-project.org\")\nlibrary(sas7bdat)\nmydata <- read.sas7bdat(\"data/sales.sas7bdat\")\nhead(mydata)\n# from Stata\nmydata <- read.dta(\"data/trade.dta\")\nhead(mydata)\n# from SPSS\nmySPSSData <- read.spss(\"data/Anxiety 2.sav\",\n to.data.frame=TRUE,\n use.value.labels=FALSE)\nhead(mySPSSData)", "execution_count": 34, "outputs": [ { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
purchaseagegenderincome
0 41 FemaleLow
0 47 FemaleLow
1 41 FemaleLow
1 39 FemaleLow
0 32 FemaleLow
0 32 FemaleLow
\n", "text/latex": "\\begin{tabular}{r|llll}\n purchase & age & gender & income\\\\\n\\hline\n\t 0 & 41 & Female & Low \\\\\n\t 0 & 47 & Female & Low \\\\\n\t 1 & 41 & Female & Low \\\\\n\t 1 & 39 & Female & Low \\\\\n\t 0 & 32 & Female & Low \\\\\n\t 0 & 32 & Female & Low \\\\\n\\end{tabular}\n", "text/markdown": "\npurchase | age | gender | income | \n|---|---|---|---|---|---|\n| 0 | 41 | Female | Low | \n| 0 | 47 | Female | Low | \n| 1 | 41 | Female | Low | \n| 1 | 39 | Female | Low | \n| 0 | 32 | Female | Low | \n| 0 | 32 | Female | Low | \n\n\n", "text/plain": " purchase age gender income\n1 0 41 Female Low \n2 0 47 Female Low \n3 1 41 Female Low \n4 1 39 Female Low \n5 0 32 Female Low \n6 0 32 Female Low " }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
DateImportWeight_IExportWeight_E
2013-12-3137664782 54029106 54505513 93350013
2012-12-3116316512 21584365 102700010 158000010
2011-12-3111082246 14526089 37935000 88000000
2010-12-3135677943 55034932 48515008 112000005
2009-12-31 9879878 14806865 71486545 131800000
2008-12-31 1539992 1749318 12311696 18500014
\n", "text/latex": "\\begin{tabular}{r|lllll}\n Date & Import & Weight\\_I & Export & Weight\\_E\\\\\n\\hline\n\t 2013-12-31 & 37664782 & 54029106 & 54505513 & 93350013 \\\\\n\t 2012-12-31 & 16316512 & 21584365 & 102700010 & 158000010 \\\\\n\t 2011-12-31 & 11082246 & 14526089 & 37935000 & 88000000 \\\\\n\t 2010-12-31 & 35677943 & 55034932 & 48515008 & 112000005 \\\\\n\t 2009-12-31 & 9879878 & 14806865 & 71486545 & 131800000 \\\\\n\t 2008-12-31 & 1539992 & 1749318 & 12311696 & 18500014 \\\\\n\\end{tabular}\n", "text/markdown": "\nDate | Import | Weight_I | Export | Weight_E | \n|---|---|---|---|---|---|\n| 2013-12-31 | 37664782 | 54029106 | 54505513 | 93350013 | \n| 2012-12-31 | 16316512 | 21584365 | 102700010 | 158000010 | \n| 2011-12-31 | 11082246 | 14526089 | 37935000 | 88000000 | \n| 2010-12-31 | 35677943 | 55034932 | 48515008 | 112000005 | \n| 2009-12-31 | 9879878 | 14806865 | 71486545 | 131800000 | \n| 2008-12-31 | 1539992 | 1749318 | 12311696 | 18500014 | \n\n\n", "text/plain": " Date Import Weight_I Export Weight_E \n1 2013-12-31 37664782 54029106 54505513 93350013\n2 2012-12-31 16316512 21584365 102700010 158000010\n3 2011-12-31 11082246 14526089 37935000 88000000\n4 2010-12-31 35677943 55034932 48515008 112000005\n5 2009-12-31 9879878 14806865 71486545 131800000\n6 2008-12-31 1539992 1749318 12311696 18500014" }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": "re-encoding from CP1252\n" }, { "data": { "text/html": "\n\n\n\t\n\t\n\t\n\t\n\t\n\t\n\n
subjectanxietytensiontrial1trial2trial3trial4
1 1 1 1814126
2 1 1 1912 84
3 1 1 1410 62
4 1 2 1612104
5 1 2 12 8 62
6 1 2 1810 51
\n", "text/latex": "\\begin{tabular}{r|lllllll}\n subject & anxiety & tension & trial1 & trial2 & trial3 & trial4\\\\\n\\hline\n\t 1 & 1 & 1 & 18 & 14 & 12 & 6 \\\\\n\t 2 & 1 & 1 & 19 & 12 & 8 & 4 \\\\\n\t 3 & 1 & 1 & 14 & 10 & 6 & 2 \\\\\n\t 4 & 1 & 2 & 16 & 12 & 10 & 4 \\\\\n\t 5 & 1 & 2 & 12 & 8 & 6 & 2 \\\\\n\t 6 & 1 & 2 & 18 & 10 & 5 & 1 \\\\\n\\end{tabular}\n", "text/markdown": "\nsubject | anxiety | tension | trial1 | trial2 | trial3 | trial4 | \n|---|---|---|---|---|---|\n| 1 | 1 | 1 | 18 | 14 | 12 | 6 | \n| 2 | 1 | 1 | 19 | 12 | 8 | 4 | \n| 3 | 1 | 1 | 14 | 10 | 6 | 2 | \n| 4 | 1 | 2 | 16 | 12 | 10 | 4 | \n| 5 | 1 | 2 | 12 | 8 | 6 | 2 | \n| 6 | 1 | 2 | 18 | 10 | 5 | 1 | \n\n\n", "text/plain": " subject anxiety tension trial1 trial2 trial3 trial4\n1 1 1 1 18 14 12 6 \n2 2 1 1 19 12 8 4 \n3 3 1 1 14 10 6 2 \n4 4 1 2 16 12 10 4 \n5 5 1 2 12 8 6 2 \n6 6 1 2 18 10 5 1 " }, "metadata": {}, "output_type": "display_data" } ] }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "### Hmisc package" }, { "metadata": { "collapsed": true, "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" }, "trusted": false }, "cell_type": "code", "source": "# from SAS\n# from stata\n# from SPSS", "execution_count": 35, "outputs": [] }, { "metadata": {}, "cell_type": "markdown", "source": "**This R Data Import Tutorial Is Everything You Need**\nhttps://www.datacamp.com/community/tutorials/r-data-import-tutorial" }, { "metadata": { "slideshow": { "slide_type": "slide" } }, "cell_type": "markdown", "source": "## One package for basic import/export (`rio` package)" }, { "metadata": { "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "```r\nlibrary(\"rio\")\n\n# Exporting data is handled with one function, export():\nexport(mtcars, \"mtcars.csv\") # comma-separated values\nexport(mtcars, \"mtcars.rds\") # R serialized\nexport(mtcars, \"mtcars.sav\") # SPSS\n\n# Importing data is handled with one function, import():\nx <- import(\"mtcars.csv\")\ny <- import(\"mtcars.rds\")\nz <- import(\"mtcars.sav\")\n```" }, { "metadata": { "slideshow": { "slide_type": "subslide" } }, "cell_type": "markdown", "source": "**Further references:** \nhttps://github.com/leeper/rio \nhttps://cran.r-project.org/web/packages/rio/vignettes/rio.html " } ], "metadata": { "celltoolbar": "Slideshow", "hide_input": false, "kernelspec": { "name": "r", "display_name": "R", "language": "R" }, "language_info": { "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.1", "file_extension": ".r", "codemirror_mode": "r" }, "nav_menu": {}, "toc": { "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "base_numbering": 1, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "475px", "left": "0px", "right": "1154px", "top": "111px", "width": "212px" }, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }