{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": "# Quickstart" }, { "metadata": {}, "cell_type": "markdown", "source": "To start working with Kotlin DataFrame in a notebook, run a cell with the following code:" }, { "cell_type": "code", "metadata": { "collapsed": true, "ExecuteTime": { "end_time": "2025-05-27T17:21:01.955626Z", "start_time": "2025-05-27T17:21:00.796448Z" } }, "source": [ "%useLatestDescriptors\n", "%use dataframe@kc25" ], "outputs": [], "execution_count": 9 }, { "metadata": {}, "cell_type": "markdown", "source": "This will load all necessary DataFrame dependencies (of the latest stable version) and all imports, as well as DataFrame rendering. Learn more [here](https://kotlin.github.io/dataframe/gettingstartedkotlinnotebook.html#integrate-kotlin-dataframe)." }, { "metadata": {}, "cell_type": "markdown", "source": "## Read DataFrame" }, { "metadata": {}, "cell_type": "markdown", "source": "Kotlin DataFrame supports popular data formats, including CSV, JSON and Excel, as well as reading from various databases. Read a CSV file with the \"JetBrains Repositories\" dataset into the `df` variable:" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:02.672893Z", "start_time": "2025-05-27T17:21:01.959987Z" } }, "cell_type": "code", "source": [ "val df = DataFrame.readCsv(\n", " \"https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv\"\n", ")" ], "outputs": [], "execution_count": 10 }, { "metadata": {}, "cell_type": "markdown", "source": "## Display And Explore" }, { "metadata": {}, "cell_type": "markdown", "source": "To display a dataframe as the cell output, place it on the last line of the cell:" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:02.809925Z", "start_time": "2025-05-27T17:21:02.676858Z" } }, "cell_type": "code", "source": "df", "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
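<table><thead><tr><th>full_name</th><th>html_url</th><th>stargazers_count</th><th>topics</th><th>watchers</th></tr></thead><tbody>
<tr><td>JetBrains/JPS</td><td>https://github.com/JetBrains/JPS</td><td>23</td><td>[]</td><td>23</td></tr>
<tr><td>JetBrains/YouTrackSharp</td><td>https://github.com/JetBrains/YouTrack...</td><td>115</td><td>[jetbrains, jetbrains-youtrack, youtr...</td><td>115</td></tr>
<tr><td>JetBrains/colorSchemeTool</td><td>https://github.com/JetBrains/colorSch...</td><td>290</td><td>[]</td><td>290</td></tr>
<tr><td>JetBrains/ideavim</td><td>https://github.com/JetBrains/ideavim</td><td>6120</td><td>[ideavim, intellij, intellij-platform...</td><td>6120</td></tr>
<tr><td>JetBrains/youtrack-vcs-hooks</td><td>https://github.com/JetBrains/youtrack...</td><td>5</td><td>[]</td><td>5</td></tr>
<tr><td>JetBrains/youtrack-rest-ruby-library</td><td>https://github.com/JetBrains/youtrack...</td><td>8</td><td>[]</td><td>8</td></tr>
<tr><td>JetBrains/emacs4ij</td><td>https://github.com/JetBrains/emacs4ij</td><td>47</td><td>[]</td><td>47</td></tr>
<tr><td>JetBrains/codereview4intellij</td><td>https://github.com/JetBrains/coderevi...</td><td>11</td><td>[]</td><td>11</td></tr>
<tr><td>JetBrains/teamcity-nuget-support</td><td>https://github.com/JetBrains/teamcity...</td><td>41</td><td>[nuget, nuget-feed, teamcity, teamcit...</td><td>41</td></tr>
<tr><td>JetBrains/Grammar-Kit</td><td>https://github.com/JetBrains/Grammar-Kit</td><td>534</td><td>[]</td><td>534</td></tr>
<tr><td>JetBrains/intellij-starteam-plugin</td><td>https://github.com/JetBrains/intellij...</td><td>6</td><td>[]</td><td>6</td></tr>
<tr><td>JetBrains/la-clojure</td><td>https://github.com/JetBrains/la-clojure</td><td>218</td><td>[]</td><td>218</td></tr>
<tr><td>JetBrains/MPS</td><td>https://github.com/JetBrains/MPS</td><td>1241</td><td>[domain-specific-language, dsl]</td><td>1241</td></tr>
<tr><td>JetBrains/intellij-community</td><td>https://github.com/JetBrains/intellij...</td><td>12926</td><td>[code-editor, ide, intellij, intellij...</td><td>12926</td></tr>
<tr><td>JetBrains/TeamCity.ServiceMessages</td><td>https://github.com/JetBrains/TeamCity...</td><td>39</td><td>[c-sharp, teamcity, teamcity-service-...</td><td>39</td></tr>
<tr><td>JetBrains/youtrack-rest-python-library</td><td>https://github.com/JetBrains/youtrack...</td><td>118</td><td>[]</td><td>118</td></tr>
<tr><td>JetBrains/intellij-scala</td><td>https://github.com/JetBrains/intellij...</td><td>1066</td><td>[intellij-idea, intellij-plugin, scala]</td><td>1066</td></tr>
<tr><td>JetBrains/teamcity-messages</td><td>https://github.com/JetBrains/teamcity...</td><td>125</td><td>[]</td><td>125</td></tr>
<tr><td>JetBrains/teamcity-cpp</td><td>https://github.com/JetBrains/teamcity...</td><td>27</td><td>[]</td><td>27</td></tr>
<tr><td>JetBrains/kotlin</td><td>https://github.com/JetBrains/kotlin</td><td>39402</td><td>[compiler, gradle-plugin, intellij-pl...</td><td>39402</td></tr></tbody></table>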
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"full_name\",\"html_url\",\"stargazers_count\",\"topics\",\"watchers\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"java.net.URL\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"}],\"nrow\":562,\"ncol\":5},\"kotlin_dataframe\":[{\"full_name\":\"JetBrains/JPS\",\"html_url\":\"https://github.com/JetBrains/JPS\",\"stargazers_count\":23,\"topics\":\"[]\",\"watchers\":23},{\"full_name\":\"JetBrains/YouTrackSharp\",\"html_url\":\"https://github.com/JetBrains/YouTrackSharp\",\"stargazers_count\":115,\"topics\":\"[jetbrains, jetbrains-youtrack, youtrack, youtrack-api]\",\"watchers\":115},{\"full_name\":\"JetBrains/colorSchemeTool\",\"html_url\":\"https://github.com/JetBrains/colorSchemeTool\",\"stargazers_count\":290,\"topics\":\"[]\",\"watchers\":290},{\"full_name\":\"JetBrains/ideavim\",\"html_url\":\"https://github.com/JetBrains/ideavim\",\"stargazers_count\":6120,\"topics\":\"[ideavim, intellij, intellij-platform, jb-official, kotlin, vim, 
vim-emulator]\",\"watchers\":6120},{\"full_name\":\"JetBrains/youtrack-vcs-hooks\",\"html_url\":\"https://github.com/JetBrains/youtrack-vcs-hooks\",\"stargazers_count\":5,\"topics\":\"[]\",\"watchers\":5},{\"full_name\":\"JetBrains/youtrack-rest-ruby-library\",\"html_url\":\"https://github.com/JetBrains/youtrack-rest-ruby-library\",\"stargazers_count\":8,\"topics\":\"[]\",\"watchers\":8},{\"full_name\":\"JetBrains/emacs4ij\",\"html_url\":\"https://github.com/JetBrains/emacs4ij\",\"stargazers_count\":47,\"topics\":\"[]\",\"watchers\":47},{\"full_name\":\"JetBrains/codereview4intellij\",\"html_url\":\"https://github.com/JetBrains/codereview4intellij\",\"stargazers_count\":11,\"topics\":\"[]\",\"watchers\":11},{\"full_name\":\"JetBrains/teamcity-nuget-support\",\"html_url\":\"https://github.com/JetBrains/teamcity-nuget-support\",\"stargazers_count\":41,\"topics\":\"[nuget, nuget-feed, teamcity, teamcity-plugin]\",\"watchers\":41},{\"full_name\":\"JetBrains/Grammar-Kit\",\"html_url\":\"https://github.com/JetBrains/Grammar-Kit\",\"stargazers_count\":534,\"topics\":\"[]\",\"watchers\":534},{\"full_name\":\"JetBrains/intellij-starteam-plugin\",\"html_url\":\"https://github.com/JetBrains/intellij-starteam-plugin\",\"stargazers_count\":6,\"topics\":\"[]\",\"watchers\":6},{\"full_name\":\"JetBrains/la-clojure\",\"html_url\":\"https://github.com/JetBrains/la-clojure\",\"stargazers_count\":218,\"topics\":\"[]\",\"watchers\":218},{\"full_name\":\"JetBrains/MPS\",\"html_url\":\"https://github.com/JetBrains/MPS\",\"stargazers_count\":1241,\"topics\":\"[domain-specific-language, dsl]\",\"watchers\":1241},{\"full_name\":\"JetBrains/intellij-community\",\"html_url\":\"https://github.com/JetBrains/intellij-community\",\"stargazers_count\":12926,\"topics\":\"[code-editor, ide, intellij, intellij-community, 
intellij-platform]\",\"watchers\":12926},{\"full_name\":\"JetBrains/TeamCity.ServiceMessages\",\"html_url\":\"https://github.com/JetBrains/TeamCity.ServiceMessages\",\"stargazers_count\":39,\"topics\":\"[c-sharp, teamcity, teamcity-service-messages]\",\"watchers\":39},{\"full_name\":\"JetBrains/youtrack-rest-python-library\",\"html_url\":\"https://github.com/JetBrains/youtrack-rest-python-library\",\"stargazers_count\":118,\"topics\":\"[]\",\"watchers\":118},{\"full_name\":\"JetBrains/intellij-scala\",\"html_url\":\"https://github.com/JetBrains/intellij-scala\",\"stargazers_count\":1066,\"topics\":\"[intellij-idea, intellij-plugin, scala]\",\"watchers\":1066},{\"full_name\":\"JetBrains/teamcity-messages\",\"html_url\":\"https://github.com/JetBrains/teamcity-messages\",\"stargazers_count\":125,\"topics\":\"[]\",\"watchers\":125},{\"full_name\":\"JetBrains/teamcity-cpp\",\"html_url\":\"https://github.com/JetBrains/teamcity-cpp\",\"stargazers_count\":27,\"topics\":\"[]\",\"watchers\":27},{\"full_name\":\"JetBrains/kotlin\",\"html_url\":\"https://github.com/JetBrains/kotlin\",\"stargazers_count\":39402,\"topics\":\"[compiler, gradle-plugin, intellij-plugin, kotlin, kotlin-library, maven-plugin, programming-language]\",\"watchers\":39402}]}" }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 11 }, { "metadata": {}, "cell_type": "markdown", "source": "Kotlin Notebook has special interactive outputs for `DataFrame`. Learn more about them [here](https://kotlin.github.io/dataframe/kotlin-dataframe-features-in-kotlin-notebook.html)." }, { "metadata": {}, "cell_type": "markdown", "source": "Use `.describe()` method to get dataset summaries — column types, number of nulls and simple statistics." 
}, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:03.040814Z", "start_time": "2025-05-27T17:21:02.817272Z" } }, "cell_type": "code", "source": "df.describe()", "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
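<table><thead><tr><th>name</th><th>type</th><th>count</th><th>unique</th><th>nulls</th><th>top</th><th>freq</th><th>mean</th><th>std</th><th>min</th><th>p25</th><th>median</th><th>p75</th><th>max</th></tr></thead><tbody>
<tr><td>full_name</td><td>String</td><td>562</td><td>562</td><td>0</td><td>JetBrains/JPS</td><td>1</td><td>null</td><td>null</td><td>JetBrains/Android-Tuts-Samples</td><td>JetBrains/eslint-config</td><td>JetBrains/lightbeam</td><td>JetBrains/teamcity-bitbucket-issues</td><td>JetBrains/ztools</td></tr>
<tr><td>html_url</td><td>URL</td><td>562</td><td>562</td><td>0</td><td>https://github.com/JetBrains/JPS</td><td>1</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td><td>null</td></tr>
<tr><td>stargazers_count</td><td>Int</td><td>562</td><td>165</td><td>0</td><td>1</td><td>100</td><td>244.759786</td><td>1862.801982</td><td>0</td><td>2.000000</td><td>8.000000</td><td>48.000000</td><td>39402</td></tr>
<tr><td>topics</td><td>String</td><td>562</td><td>145</td><td>0</td><td>[]</td><td>401</td><td>null</td><td>null</td><td>[2d, graphics, java, skia]</td><td>[]</td><td>[]</td><td>[awt, swing]</td><td>[youtrack, youtrack-workflow]</td></tr>
<tr><td>watchers</td><td>Int</td><td>562</td><td>165</td><td>0</td><td>1</td><td>100</td><td>244.759786</td><td>1862.801982</td><td>0</td><td>2.000000</td><td>8.000000</td><td>48.000000</td><td>39402</td></tr></tbody></table>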
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"name\",\"type\",\"count\",\"unique\",\"nulls\",\"top\",\"freq\",\"mean\",\"std\",\"min\",\"p25\",\"median\",\"p75\",\"max\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"java.io.Serializable\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Double?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Double?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Comparable<*>?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Comparable<*>?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Comparable<*>?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Comparable<*>?\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Comparable<*>?\"}],\"nrow\":5,\"ncol\":14},\"kotlin_dataframe\":[{\"name\":\"full_name\",\"type\":\"String\",\"count\":562,\"unique\":562,\"nulls\":0,\"top\":\"JetBrains/JPS\",\"freq\":1,\"mean\":null,\"std\":null,\"min\":\"JetBrains/Android-Tuts-Samples\",\"p25\":\"JetBrains/eslint-config\",\"median\":\"JetBrains/lightbeam\",\"p75\":\"JetBrains/teamcity-bitbucket-issues\",\"max\":\"JetBrains/ztools\"},{\"name\":\"html_url\",\"type\":\"URL\",\"count\":562,\"unique\":562,\"nulls\":0,\"top\":\"https://github.com/JetBrains/JPS\",\"freq\":1,\"mean\":null,\"std\":null,\"min\":null,\"p25\":null,\"median\":null,\"p75\":null,\"max\":null},{\"name\":\"stargazers_count\",\"type\":\"Int\",\"count\":562,\"unique\":165,\"nulls\":0,\"top\":\"1\",\"freq\":100,\"mean\":244.75978647686833,\"std\":1862.8019819171673,\"min\":\"0\",\"p25\":\"2.0\",\"median\":\"8.0\",\"p75\":\"48.0\",\"max\":\"39402\"},{\"name\":\"topics\",\"type\":\"String\",\"count\":562,\"unique\":145,\"nulls\
":0,\"top\":\"[]\",\"freq\":401,\"mean\":null,\"std\":null,\"min\":\"[2d, graphics, java, skia]\",\"p25\":\"[]\",\"median\":\"[]\",\"p75\":\"[awt, swing]\",\"max\":\"[youtrack, youtrack-workflow]\"},{\"name\":\"watchers\",\"type\":\"Int\",\"count\":562,\"unique\":165,\"nulls\":0,\"top\":\"1\",\"freq\":100,\"mean\":244.75978647686833,\"std\":1862.8019819171673,\"min\":\"0\",\"p25\":\"2.0\",\"median\":\"8.0\",\"p75\":\"48.0\",\"max\":\"39402\"}]}" }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 12 }, { "metadata": {}, "cell_type": "markdown", "source": "## Select Columns" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Kotlin DataFrame features a typesafe [Columns Selection DSL](https://kotlin.github.io/dataframe/columnselectors.html), enabling flexible and safe selection of any combination of columns.\n", "Column selectors are widely used across operations — one of the simplest examples is `.select { }`, which returns a new DataFrame with only the columns chosen in [Columns Selection](https://kotlin.github.io/dataframe/columnselectors.html) expression.\n", "\n", "After executing the cell where a `DataFrame` variable is declared, an extension with properties for its columns is automatically generated.\n", "These properties can then be used in the [Columns Selection DSL](https://kotlin.github.io/dataframe/columnselectors.html) expression for typesafe and convenient column access.\n", "\n", "Select some columns:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:03.489356Z", "start_time": "2025-05-27T17:21:03.044798Z" } }, "cell_type": "code", "source": [ "// Select \"full_name\", \"stargazers_count\" and \"topics\" columns\n", "val dfSelected = df.select { full_name and stargazers_count and topics }\n", "dfSelected" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
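<table><thead><tr><th>full_name</th><th>stargazers_count</th><th>topics</th></tr></thead><tbody>
<tr><td>JetBrains/JPS</td><td>23</td><td>[]</td></tr>
<tr><td>JetBrains/YouTrackSharp</td><td>115</td><td>[jetbrains, jetbrains-youtrack, youtr...</td></tr>
<tr><td>JetBrains/colorSchemeTool</td><td>290</td><td>[]</td></tr>
<tr><td>JetBrains/ideavim</td><td>6120</td><td>[ideavim, intellij, intellij-platform...</td></tr>
<tr><td>JetBrains/youtrack-vcs-hooks</td><td>5</td><td>[]</td></tr>
<tr><td>JetBrains/youtrack-rest-ruby-library</td><td>8</td><td>[]</td></tr>
<tr><td>JetBrains/emacs4ij</td><td>47</td><td>[]</td></tr>
<tr><td>JetBrains/codereview4intellij</td><td>11</td><td>[]</td></tr>
<tr><td>JetBrains/teamcity-nuget-support</td><td>41</td><td>[nuget, nuget-feed, teamcity, teamcit...</td></tr>
<tr><td>JetBrains/Grammar-Kit</td><td>534</td><td>[]</td></tr>
<tr><td>JetBrains/intellij-starteam-plugin</td><td>6</td><td>[]</td></tr>
<tr><td>JetBrains/la-clojure</td><td>218</td><td>[]</td></tr>
<tr><td>JetBrains/MPS</td><td>1241</td><td>[domain-specific-language, dsl]</td></tr>
<tr><td>JetBrains/intellij-community</td><td>12926</td><td>[code-editor, ide, intellij, intellij...</td></tr>
<tr><td>JetBrains/TeamCity.ServiceMessages</td><td>39</td><td>[c-sharp, teamcity, teamcity-service-...</td></tr>
<tr><td>JetBrains/youtrack-rest-python-library</td><td>118</td><td>[]</td></tr>
<tr><td>JetBrains/intellij-scala</td><td>1066</td><td>[intellij-idea, intellij-plugin, scala]</td></tr>
<tr><td>JetBrains/teamcity-messages</td><td>125</td><td>[]</td></tr>
<tr><td>JetBrains/teamcity-cpp</td><td>27</td><td>[]</td></tr>
<tr><td>JetBrains/kotlin</td><td>39402</td><td>[compiler, gradle-plugin, intellij-pl...</td></tr></tbody></table>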
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"full_name\",\"stargazers_count\",\"topics\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"}],\"nrow\":562,\"ncol\":3},\"kotlin_dataframe\":[{\"full_name\":\"JetBrains/JPS\",\"stargazers_count\":23,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/YouTrackSharp\",\"stargazers_count\":115,\"topics\":\"[jetbrains, jetbrains-youtrack, youtrack, youtrack-api]\"},{\"full_name\":\"JetBrains/colorSchemeTool\",\"stargazers_count\":290,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/ideavim\",\"stargazers_count\":6120,\"topics\":\"[ideavim, intellij, intellij-platform, jb-official, kotlin, vim, vim-emulator]\"},{\"full_name\":\"JetBrains/youtrack-vcs-hooks\",\"stargazers_count\":5,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/youtrack-rest-ruby-library\",\"stargazers_count\":8,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/emacs4ij\",\"stargazers_count\":47,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/codereview4intellij\",\"stargazers_count\":11,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/teamcity-nuget-support\",\"stargazers_count\":41,\"topics\":\"[nuget, nuget-feed, teamcity, teamcity-plugin]\"},{\"full_name\":\"JetBrains/Grammar-Kit\",\"stargazers_count\":534,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/intellij-starteam-plugin\",\"stargazers_count\":6,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/la-clojure\",\"stargazers_count\":218,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/MPS\",\"stargazers_count\":1241,\"topics\":\"[domain-specific-language, dsl]\"},{\"full_name\":\"JetBrains/intellij-community\",\"stargazers_count\":12926,\"topics\":\"[code-editor, ide, intellij, intellij-community, intellij-platform]\"},{\"full_name\":\"JetBrains/TeamCity.ServiceMessages\",\"stargazers_count\":39,\"topics\":\"[c-sharp, teamcity, 
teamcity-service-messages]\"},{\"full_name\":\"JetBrains/youtrack-rest-python-library\",\"stargazers_count\":118,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/intellij-scala\",\"stargazers_count\":1066,\"topics\":\"[intellij-idea, intellij-plugin, scala]\"},{\"full_name\":\"JetBrains/teamcity-messages\",\"stargazers_count\":125,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/teamcity-cpp\",\"stargazers_count\":27,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/kotlin\",\"stargazers_count\":39402,\"topics\":\"[compiler, gradle-plugin, intellij-plugin, kotlin, kotlin-library, maven-plugin, programming-language]\"}]}" }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 13 }, { "metadata": {}, "cell_type": "markdown", "source": "## Row Filtering" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Some operations use a `RowExpression`, i.e., an expression that is evaluated for each `DataFrame` row.\n", "For example, `.filter { }` returns a new `DataFrame` with the rows that satisfy the condition given by the row expression.\n", "\n", "Inside a row expression, you can access the values of the current row by column names through auto-generated properties.\n", "This is similar to the [Columns Selection DSL](https://kotlin.github.io/dataframe/columnselectors.html), but in this case the properties represent actual values, not column references.\n", "\n", "Filter rows by \"stargazers_count\" value:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:03.726083Z", "start_time": "2025-05-27T17:21:03.494323Z" } }, "cell_type": "code", "source": [ "// Keep only rows where \"stargazers_count\" value is at least 1000\n", "val dfFiltered = dfSelected.filter { stargazers_count >= 1000 }\n", "dfFiltered" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
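<table><thead><tr><th>full_name</th><th>stargazers_count</th><th>topics</th></tr></thead><tbody>
<tr><td>JetBrains/ideavim</td><td>6120</td><td>[ideavim, intellij, intellij-platform...</td></tr>
<tr><td>JetBrains/MPS</td><td>1241</td><td>[domain-specific-language, dsl]</td></tr>
<tr><td>JetBrains/intellij-community</td><td>12926</td><td>[code-editor, ide, intellij, intellij...</td></tr>
<tr><td>JetBrains/intellij-scala</td><td>1066</td><td>[intellij-idea, intellij-plugin, scala]</td></tr>
<tr><td>JetBrains/kotlin</td><td>39402</td><td>[compiler, gradle-plugin, intellij-pl...</td></tr>
<tr><td>JetBrains/intellij-plugins</td><td>1737</td><td>[]</td></tr>
<tr><td>JetBrains/Exposed</td><td>5688</td><td>[dao, kotlin, orm, sql]</td></tr>
<tr><td>JetBrains/kotlin-web-site</td><td>1074</td><td>[kotlin]</td></tr>
<tr><td>JetBrains/idea-gitignore</td><td>1181</td><td>[gitignore, ignore-files, intellij, i...</td></tr>
<tr><td>JetBrains/swot</td><td>1072</td><td>[]</td></tr>
<tr><td>JetBrains/phpstorm-stubs</td><td>1110</td><td>[]</td></tr>
<tr><td>JetBrains/gradle-intellij-plugin</td><td>1058</td><td>[gradle, gradle-intellij-plugin, grad...</td></tr>
<tr><td>JetBrains/svg-sprite-loader</td><td>1815</td><td>[sprite, svg, svg-sprite, svg-stack, ...</td></tr>
<tr><td>JetBrains/resharper-unity</td><td>1017</td><td>[hacktoberfest, jetbrains, plugin, re...</td></tr>
<tr><td>JetBrains/kotlin-native</td><td>7101</td><td>[c, compiler, kotlin, llvm, objective-c]</td></tr>
<tr><td>JetBrains/create-react-kotlin-app</td><td>2424</td><td>[create-react-app, jetbrains-ui, kotl...</td></tr>
<tr><td>JetBrains/ring-ui</td><td>2836</td><td>[components, jetbrains-ui, react]</td></tr>
<tr><td>JetBrains/kotlinconf-app</td><td>2628</td><td>[]</td></tr>
<tr><td>JetBrains/JetBrainsMono</td><td>6059</td><td>[coding-font, font, ligatures, monosp...</td></tr>
<tr><td>JetBrains/intellij-platform-plugin-te...</td><td>1133</td><td>[intellij, intellij-idea, intellij-id...</td></tr></tbody></table>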
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"full_name\",\"stargazers_count\",\"topics\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"}],\"nrow\":24,\"ncol\":3},\"kotlin_dataframe\":[{\"full_name\":\"JetBrains/ideavim\",\"stargazers_count\":6120,\"topics\":\"[ideavim, intellij, intellij-platform, jb-official, kotlin, vim, vim-emulator]\"},{\"full_name\":\"JetBrains/MPS\",\"stargazers_count\":1241,\"topics\":\"[domain-specific-language, dsl]\"},{\"full_name\":\"JetBrains/intellij-community\",\"stargazers_count\":12926,\"topics\":\"[code-editor, ide, intellij, intellij-community, intellij-platform]\"},{\"full_name\":\"JetBrains/intellij-scala\",\"stargazers_count\":1066,\"topics\":\"[intellij-idea, intellij-plugin, scala]\"},{\"full_name\":\"JetBrains/kotlin\",\"stargazers_count\":39402,\"topics\":\"[compiler, gradle-plugin, intellij-plugin, kotlin, kotlin-library, maven-plugin, programming-language]\"},{\"full_name\":\"JetBrains/intellij-plugins\",\"stargazers_count\":1737,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/Exposed\",\"stargazers_count\":5688,\"topics\":\"[dao, kotlin, orm, sql]\"},{\"full_name\":\"JetBrains/kotlin-web-site\",\"stargazers_count\":1074,\"topics\":\"[kotlin]\"},{\"full_name\":\"JetBrains/idea-gitignore\",\"stargazers_count\":1181,\"topics\":\"[gitignore, ignore-files, intellij, intellij-plugin, java]\"},{\"full_name\":\"JetBrains/swot\",\"stargazers_count\":1072,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/phpstorm-stubs\",\"stargazers_count\":1110,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/gradle-intellij-plugin\",\"stargazers_count\":1058,\"topics\":\"[gradle, gradle-intellij-plugin, gradle-kotlin-dsl, groovy, intellij, intellij-ides, intellij-platform, intellij-plugin, intellij-sdk, jetbrains-plugin, kotlin, plugin-verifier, publishing-dsl, 
setup-dsl, teamcity, travis-configuration]\"},{\"full_name\":\"JetBrains/svg-sprite-loader\",\"stargazers_count\":1815,\"topics\":\"[sprite, svg, svg-sprite, svg-stack, webpack, webpack-loader, webpack-plugin, webpack2, webpack3]\"},{\"full_name\":\"JetBrains/resharper-unity\",\"stargazers_count\":1017,\"topics\":\"[hacktoberfest, jetbrains, plugin, resharper, resharper-plugin, rider, unity, unity-editor]\"},{\"full_name\":\"JetBrains/kotlin-native\",\"stargazers_count\":7101,\"topics\":\"[c, compiler, kotlin, llvm, objective-c]\"},{\"full_name\":\"JetBrains/create-react-kotlin-app\",\"stargazers_count\":2424,\"topics\":\"[create-react-app, jetbrains-ui, kotlin, react, webpack]\"},{\"full_name\":\"JetBrains/ring-ui\",\"stargazers_count\":2836,\"topics\":\"[components, jetbrains-ui, react]\"},{\"full_name\":\"JetBrains/kotlinconf-app\",\"stargazers_count\":2628,\"topics\":\"[]\"},{\"full_name\":\"JetBrains/JetBrainsMono\",\"stargazers_count\":6059,\"topics\":\"[coding-font, font, ligatures, monospaced-font, programming-font, programming-ligatures]\"},{\"full_name\":\"JetBrains/intellij-platform-plugin-template\",\"stargazers_count\":1133,\"topics\":\"[intellij, intellij-idea, intellij-idea-plugin, intellij-platform, intellij-plugin, intellij-plugins, jetbrains-plugin]\"}]}" }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 14 }, { "metadata": {}, "cell_type": "markdown", "source": "## Columns Rename" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Columns can be renamed using the `.rename { }` operation, which also uses the [Columns Selection DSL](https://kotlin.github.io/dataframe/columnselectors.html) to select a column to rename.\n", "The `rename` operation does not perform the renaming immediately; instead, it creates an intermediate object that must be finalized into a new `DataFrame` by calling the `.into()` function with the new column name.\n", "\n", "Rename \"full_name\" and \"stargazers_count\" 
columns:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:04.133144Z", "start_time": "2025-05-27T17:21:03.730392Z" } }, "cell_type": "code", "source": [ "// Rename \"full_name\" column into \"name\"\n", "val dfRenamed = dfFiltered\n", " .rename { full_name }.into(\"name\")\n", " // And \"stargazers_count\" into \"starsCount\"\n", " .rename { stargazers_count }.into(\"starsCount\")\n", "dfRenamed" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
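<table><thead><tr><th>name</th><th>starsCount</th><th>topics</th></tr></thead><tbody>
<tr><td>JetBrains/ideavim</td><td>6120</td><td>[ideavim, intellij, intellij-platform...</td></tr>
<tr><td>JetBrains/MPS</td><td>1241</td><td>[domain-specific-language, dsl]</td></tr>
<tr><td>JetBrains/intellij-community</td><td>12926</td><td>[code-editor, ide, intellij, intellij...</td></tr>
<tr><td>JetBrains/intellij-scala</td><td>1066</td><td>[intellij-idea, intellij-plugin, scala]</td></tr>
<tr><td>JetBrains/kotlin</td><td>39402</td><td>[compiler, gradle-plugin, intellij-pl...</td></tr>
<tr><td>JetBrains/intellij-plugins</td><td>1737</td><td>[]</td></tr>
<tr><td>JetBrains/Exposed</td><td>5688</td><td>[dao, kotlin, orm, sql]</td></tr>
<tr><td>JetBrains/kotlin-web-site</td><td>1074</td><td>[kotlin]</td></tr>
<tr><td>JetBrains/idea-gitignore</td><td>1181</td><td>[gitignore, ignore-files, intellij, i...</td></tr>
<tr><td>JetBrains/swot</td><td>1072</td><td>[]</td></tr>
<tr><td>JetBrains/phpstorm-stubs</td><td>1110</td><td>[]</td></tr>
<tr><td>JetBrains/gradle-intellij-plugin</td><td>1058</td><td>[gradle, gradle-intellij-plugin, grad...</td></tr>
<tr><td>JetBrains/svg-sprite-loader</td><td>1815</td><td>[sprite, svg, svg-sprite, svg-stack, ...</td></tr>
<tr><td>JetBrains/resharper-unity</td><td>1017</td><td>[hacktoberfest, jetbrains, plugin, re...</td></tr>
<tr><td>JetBrains/kotlin-native</td><td>7101</td><td>[c, compiler, kotlin, llvm, objective-c]</td></tr>
<tr><td>JetBrains/create-react-kotlin-app</td><td>2424</td><td>[create-react-app, jetbrains-ui, kotl...</td></tr>
<tr><td>JetBrains/ring-ui</td><td>2836</td><td>[components, jetbrains-ui, react]</td></tr>
<tr><td>JetBrains/kotlinconf-app</td><td>2628</td><td>[]</td></tr>
<tr><td>JetBrains/JetBrainsMono</td><td>6059</td><td>[coding-font, font, ligatures, monosp...</td></tr>
<tr><td>JetBrains/intellij-platform-plugin-te...</td><td>1133</td><td>[intellij, intellij-idea, intellij-id...</td></tr></tbody></table>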
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"name\",\"starsCount\",\"topics\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"}],\"nrow\":24,\"ncol\":3},\"kotlin_dataframe\":[{\"name\":\"JetBrains/ideavim\",\"starsCount\":6120,\"topics\":\"[ideavim, intellij, intellij-platform, jb-official, kotlin, vim, vim-emulator]\"},{\"name\":\"JetBrains/MPS\",\"starsCount\":1241,\"topics\":\"[domain-specific-language, dsl]\"},{\"name\":\"JetBrains/intellij-community\",\"starsCount\":12926,\"topics\":\"[code-editor, ide, intellij, intellij-community, intellij-platform]\"},{\"name\":\"JetBrains/intellij-scala\",\"starsCount\":1066,\"topics\":\"[intellij-idea, intellij-plugin, scala]\"},{\"name\":\"JetBrains/kotlin\",\"starsCount\":39402,\"topics\":\"[compiler, gradle-plugin, intellij-plugin, kotlin, kotlin-library, maven-plugin, programming-language]\"},{\"name\":\"JetBrains/intellij-plugins\",\"starsCount\":1737,\"topics\":\"[]\"},{\"name\":\"JetBrains/Exposed\",\"starsCount\":5688,\"topics\":\"[dao, kotlin, orm, sql]\"},{\"name\":\"JetBrains/kotlin-web-site\",\"starsCount\":1074,\"topics\":\"[kotlin]\"},{\"name\":\"JetBrains/idea-gitignore\",\"starsCount\":1181,\"topics\":\"[gitignore, ignore-files, intellij, intellij-plugin, java]\"},{\"name\":\"JetBrains/swot\",\"starsCount\":1072,\"topics\":\"[]\"},{\"name\":\"JetBrains/phpstorm-stubs\",\"starsCount\":1110,\"topics\":\"[]\"},{\"name\":\"JetBrains/gradle-intellij-plugin\",\"starsCount\":1058,\"topics\":\"[gradle, gradle-intellij-plugin, gradle-kotlin-dsl, groovy, intellij, intellij-ides, intellij-platform, intellij-plugin, intellij-sdk, jetbrains-plugin, kotlin, plugin-verifier, publishing-dsl, setup-dsl, teamcity, travis-configuration]\"},{\"name\":\"JetBrains/svg-sprite-loader\",\"starsCount\":1815,\"topics\":\"[sprite, svg, 
svg-sprite, svg-stack, webpack, webpack-loader, webpack-plugin, webpack2, webpack3]\"},{\"name\":\"JetBrains/resharper-unity\",\"starsCount\":1017,\"topics\":\"[hacktoberfest, jetbrains, plugin, resharper, resharper-plugin, rider, unity, unity-editor]\"},{\"name\":\"JetBrains/kotlin-native\",\"starsCount\":7101,\"topics\":\"[c, compiler, kotlin, llvm, objective-c]\"},{\"name\":\"JetBrains/create-react-kotlin-app\",\"starsCount\":2424,\"topics\":\"[create-react-app, jetbrains-ui, kotlin, react, webpack]\"},{\"name\":\"JetBrains/ring-ui\",\"starsCount\":2836,\"topics\":\"[components, jetbrains-ui, react]\"},{\"name\":\"JetBrains/kotlinconf-app\",\"starsCount\":2628,\"topics\":\"[]\"},{\"name\":\"JetBrains/JetBrainsMono\",\"starsCount\":6059,\"topics\":\"[coding-font, font, ligatures, monospaced-font, programming-font, programming-ligatures]\"},{\"name\":\"JetBrains/intellij-platform-plugin-template\",\"starsCount\":1133,\"topics\":\"[intellij, intellij-idea, intellij-idea-plugin, intellij-platform, intellij-plugin, intellij-plugins, jetbrains-plugin]\"}]}" }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 15 }, { "metadata": {}, "cell_type": "markdown", "source": "## Modify Columns" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Columns can be modified using the `update { }` and `convert { }` operations.\n", "Both operations select columns to modify via the [Columns Selection DSL](https://kotlin.github.io/dataframe/columnselectors.html) and, similar to `rename`, create an intermediate object that must be finalized to produce a new `DataFrame`.\n", "\n", "The `update` operation preserves the original column types, while `convert` allows changing the type.\n", "In both cases, column names and their positions remain unchanged.\n", "\n", "Update \"name\" and convert \"topics\":" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:24:22.730669Z", "start_time": "2025-05-27T17:24:22.234699Z" } }, 
"cell_type": "code", "source": [ "val dfUpdated = dfRenamed\n", " // Update \"name\" values with only its second part (after '/')\n", " .update { name }.with { it.split(\"/\")[1] }\n", " // Convert \"topics\" `String` values into `List` by splitting:\n", " .convert { topics }.with { it.removeSurrounding(\"[\", \"]\").split(\", \") }\n", "dfUpdated" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
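<table><thead><tr><th>name</th><th>starsCount</th><th>topics</th></tr></thead><tbody>
<tr><td>ideavim</td><td>6120</td><td>[ideavim, intellij, intellij-platform...</td></tr>
<tr><td>MPS</td><td>1241</td><td>[domain-specific-language, dsl]</td></tr>
<tr><td>intellij-community</td><td>12926</td><td>[code-editor, ide, intellij, intellij...</td></tr>
<tr><td>intellij-scala</td><td>1066</td><td>[intellij-idea, intellij-plugin, scala]</td></tr>
<tr><td>kotlin</td><td>39402</td><td>[compiler, gradle-plugin, intellij-pl...</td></tr>
<tr><td>intellij-plugins</td><td>1737</td><td>[]</td></tr>
<tr><td>Exposed</td><td>5688</td><td>[dao, kotlin, orm, sql]</td></tr>
<tr><td>kotlin-web-site</td><td>1074</td><td>[kotlin]</td></tr>
<tr><td>idea-gitignore</td><td>1181</td><td>[gitignore, ignore-files, intellij, i...</td></tr>
<tr><td>swot</td><td>1072</td><td>[]</td></tr>
<tr><td>phpstorm-stubs</td><td>1110</td><td>[]</td></tr>
<tr><td>gradle-intellij-plugin</td><td>1058</td><td>[gradle, gradle-intellij-plugin, grad...</td></tr>
<tr><td>svg-sprite-loader</td><td>1815</td><td>[sprite, svg, svg-sprite, svg-stack, ...</td></tr>
<tr><td>resharper-unity</td><td>1017</td><td>[hacktoberfest, jetbrains, plugin, re...</td></tr>
<tr><td>kotlin-native</td><td>7101</td><td>[c, compiler, kotlin, llvm, objective-c]</td></tr>
<tr><td>create-react-kotlin-app</td><td>2424</td><td>[create-react-app, jetbrains-ui, kotl...</td></tr>
<tr><td>ring-ui</td><td>2836</td><td>[components, jetbrains-ui, react]</td></tr>
<tr><td>kotlinconf-app</td><td>2628</td><td>[]</td></tr>
<tr><td>JetBrainsMono</td><td>6059</td><td>[coding-font, font, ligatures, monosp...</td></tr>
<tr><td>intellij-platform-plugin-template</td><td>1133</td><td>[intellij, intellij-idea, intellij-id...</td></tr></tbody></table>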
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"name\",\"starsCount\",\"topics\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.collections.List\"}],\"nrow\":24,\"ncol\":3},\"kotlin_dataframe\":[{\"name\":\"ideavim\",\"starsCount\":6120,\"topics\":[\"ideavim\",\"intellij\",\"intellij-platform\",\"jb-official\",\"kotlin\",\"vim\",\"vim-emulator\"]},{\"name\":\"MPS\",\"starsCount\":1241,\"topics\":[\"domain-specific-language\",\"dsl\"]},{\"name\":\"intellij-community\",\"starsCount\":12926,\"topics\":[\"code-editor\",\"ide\",\"intellij\",\"intellij-community\",\"intellij-platform\"]},{\"name\":\"intellij-scala\",\"starsCount\":1066,\"topics\":[\"intellij-idea\",\"intellij-plugin\",\"scala\"]},{\"name\":\"kotlin\",\"starsCount\":39402,\"topics\":[\"compiler\",\"gradle-plugin\",\"intellij-plugin\",\"kotlin\",\"kotlin-library\",\"maven-plugin\",\"programming-language\"]},{\"name\":\"intellij-plugins\",\"starsCount\":1737,\"topics\":[\"\"]},{\"name\":\"Exposed\",\"starsCount\":5688,\"topics\":[\"dao\",\"kotlin\",\"orm\",\"sql\"]},{\"name\":\"kotlin-web-site\",\"starsCount\":1074,\"topics\":[\"kotlin\"]},{\"name\":\"idea-gitignore\",\"starsCount\":1181,\"topics\":[\"gitignore\",\"ignore-files\",\"intellij\",\"intellij-plugin\",\"java\"]},{\"name\":\"swot\",\"starsCount\":1072,\"topics\":[\"\"]},{\"name\":\"phpstorm-stubs\",\"starsCount\":1110,\"topics\":[\"\"]},{\"name\":\"gradle-intellij-plugin\",\"starsCount\":1058,\"topics\":[\"gradle\",\"gradle-intellij-plugin\",\"gradle-kotlin-dsl\",\"groovy\",\"intellij\",\"intellij-ides\",\"intellij-platform\",\"intellij-plugin\",\"intellij-sdk\",\"jetbrains-plugin\",\"kotlin\",\"plugin-verifier\",\"publishing-dsl\",\"setup-dsl\",\"teamcity\",\"travis-configuration\"]},{\"name\":\"svg-sprite-loader\",\"starsCount\":1815,\"topics\":[\"sprite\",\"svg\",\"s
vg-sprite\",\"svg-stack\",\"webpack\",\"webpack-loader\",\"webpack-plugin\",\"webpack2\",\"webpack3\"]},{\"name\":\"resharper-unity\",\"starsCount\":1017,\"topics\":[\"hacktoberfest\",\"jetbrains\",\"plugin\",\"resharper\",\"resharper-plugin\",\"rider\",\"unity\",\"unity-editor\"]},{\"name\":\"kotlin-native\",\"starsCount\":7101,\"topics\":[\"c\",\"compiler\",\"kotlin\",\"llvm\",\"objective-c\"]},{\"name\":\"create-react-kotlin-app\",\"starsCount\":2424,\"topics\":[\"create-react-app\",\"jetbrains-ui\",\"kotlin\",\"react\",\"webpack\"]},{\"name\":\"ring-ui\",\"starsCount\":2836,\"topics\":[\"components\",\"jetbrains-ui\",\"react\"]},{\"name\":\"kotlinconf-app\",\"starsCount\":2628,\"topics\":[\"\"]},{\"name\":\"JetBrainsMono\",\"starsCount\":6059,\"topics\":[\"coding-font\",\"font\",\"ligatures\",\"monospaced-font\",\"programming-font\",\"programming-ligatures\"]},{\"name\":\"intellij-platform-plugin-template\",\"starsCount\":1133,\"topics\":[\"intellij\",\"intellij-idea\",\"intellij-idea-plugin\",\"intellij-platform\",\"intellij-plugin\",\"intellij-plugins\",\"jetbrains-plugin\"]}]}" }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 26 }, { "metadata": {}, "cell_type": "markdown", "source": "Check the new type of the \"topics\" column:" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:24:39.779771Z", "start_time": "2025-05-27T17:24:39.722836Z" } }, "cell_type": "code", "source": "dfUpdated.topics.type()", "outputs": [ { "data": { "text/plain": [ "kotlin.collections.List" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 27 }, { "metadata": {}, "cell_type": "markdown", "source": "## Adding New Columns" }, { "metadata": {}, "cell_type": "markdown", "source": [ "The `.add { }` function allows creating a `DataFrame` with a new column, where the value for each row is computed based on the existing values in that row. 
These values can be accessed within the row expressions.\n", "\n", "Add a new `Boolean` column \"isIntellij\":" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:25:02.027792Z", "start_time": "2025-05-27T17:25:01.788709Z" } }, "cell_type": "code", "source": [ "// Add a `Boolean` column indicating whether the `name` contains the \"intellij\" substring\n", "// or the topics include \"intellij\".\n", "val dfWithIsIntellij = dfUpdated.add(\"isIntellij\") {\n", " name.contains(\"intellij\") || \"intellij\" in topics\n", "}\n", "dfWithIsIntellij" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
namestarsCounttopicsisIntellij
ideavim6120[ideavim, intellij, intellij-platform...true
MPS1241[domain-specific-language, dsl]false
intellij-community12926[code-editor, ide, intellij, intellij...true
intellij-scala1066[intellij-idea, intellij-plugin, scala]true
kotlin39402[compiler, gradle-plugin, intellij-pl...false
intellij-plugins1737[]true
Exposed5688[dao, kotlin, orm, sql]false
kotlin-web-site1074[kotlin]false
idea-gitignore1181[gitignore, ignore-files, intellij, i...true
swot1072[]false
phpstorm-stubs1110[]false
gradle-intellij-plugin1058[gradle, gradle-intellij-plugin, grad...true
svg-sprite-loader1815[sprite, svg, svg-sprite, svg-stack, ...false
resharper-unity1017[hacktoberfest, jetbrains, plugin, re...false
kotlin-native7101[c, compiler, kotlin, llvm, objective-c]false
create-react-kotlin-app2424[create-react-app, jetbrains-ui, kotl...false
ring-ui2836[components, jetbrains-ui, react]false
kotlinconf-app2628[]false
JetBrainsMono6059[coding-font, font, ligatures, monosp...false
intellij-platform-plugin-template1133[intellij, intellij-idea, intellij-id...true
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"name\",\"starsCount\",\"topics\",\"isIntellij\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.collections.List\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"}],\"nrow\":24,\"ncol\":4},\"kotlin_dataframe\":[{\"name\":\"ideavim\",\"starsCount\":6120,\"topics\":[\"ideavim\",\"intellij\",\"intellij-platform\",\"jb-official\",\"kotlin\",\"vim\",\"vim-emulator\"],\"isIntellij\":true},{\"name\":\"MPS\",\"starsCount\":1241,\"topics\":[\"domain-specific-language\",\"dsl\"],\"isIntellij\":false},{\"name\":\"intellij-community\",\"starsCount\":12926,\"topics\":[\"code-editor\",\"ide\",\"intellij\",\"intellij-community\",\"intellij-platform\"],\"isIntellij\":true},{\"name\":\"intellij-scala\",\"starsCount\":1066,\"topics\":[\"intellij-idea\",\"intellij-plugin\",\"scala\"],\"isIntellij\":true},{\"name\":\"kotlin\",\"starsCount\":39402,\"topics\":[\"compiler\",\"gradle-plugin\",\"intellij-plugin\",\"kotlin\",\"kotlin-library\",\"maven-plugin\",\"programming-language\"],\"isIntellij\":false},{\"name\":\"intellij-plugins\",\"starsCount\":1737,\"topics\":[\"\"],\"isIntellij\":true},{\"name\":\"Exposed\",\"starsCount\":5688,\"topics\":[\"dao\",\"kotlin\",\"orm\",\"sql\"],\"isIntellij\":false},{\"name\":\"kotlin-web-site\",\"starsCount\":1074,\"topics\":[\"kotlin\"],\"isIntellij\":false},{\"name\":\"idea-gitignore\",\"starsCount\":1181,\"topics\":[\"gitignore\",\"ignore-files\",\"intellij\",\"intellij-plugin\",\"java\"],\"isIntellij\":true},{\"name\":\"swot\",\"starsCount\":1072,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"phpstorm-stubs\",\"starsCount\":1110,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"gradle-intellij-plugin\",\"starsCount\":1058,\"topics\":[\"gradle\",\"gradle-intellij-plugin\",\"gradle-kotlin-dsl\",\"groovy\",\"int
ellij\",\"intellij-ides\",\"intellij-platform\",\"intellij-plugin\",\"intellij-sdk\",\"jetbrains-plugin\",\"kotlin\",\"plugin-verifier\",\"publishing-dsl\",\"setup-dsl\",\"teamcity\",\"travis-configuration\"],\"isIntellij\":true},{\"name\":\"svg-sprite-loader\",\"starsCount\":1815,\"topics\":[\"sprite\",\"svg\",\"svg-sprite\",\"svg-stack\",\"webpack\",\"webpack-loader\",\"webpack-plugin\",\"webpack2\",\"webpack3\"],\"isIntellij\":false},{\"name\":\"resharper-unity\",\"starsCount\":1017,\"topics\":[\"hacktoberfest\",\"jetbrains\",\"plugin\",\"resharper\",\"resharper-plugin\",\"rider\",\"unity\",\"unity-editor\"],\"isIntellij\":false},{\"name\":\"kotlin-native\",\"starsCount\":7101,\"topics\":[\"c\",\"compiler\",\"kotlin\",\"llvm\",\"objective-c\"],\"isIntellij\":false},{\"name\":\"create-react-kotlin-app\",\"starsCount\":2424,\"topics\":[\"create-react-app\",\"jetbrains-ui\",\"kotlin\",\"react\",\"webpack\"],\"isIntellij\":false},{\"name\":\"ring-ui\",\"starsCount\":2836,\"topics\":[\"components\",\"jetbrains-ui\",\"react\"],\"isIntellij\":false},{\"name\":\"kotlinconf-app\",\"starsCount\":2628,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"JetBrainsMono\",\"starsCount\":6059,\"topics\":[\"coding-font\",\"font\",\"ligatures\",\"monospaced-font\",\"programming-font\",\"programming-ligatures\"],\"isIntellij\":false},{\"name\":\"intellij-platform-plugin-template\",\"starsCount\":1133,\"topics\":[\"intellij\",\"intellij-idea\",\"intellij-idea-plugin\",\"intellij-platform\",\"intellij-plugin\",\"intellij-plugins\",\"jetbrains-plugin\"],\"isIntellij\":true}]}" }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 28 }, { "metadata": {}, "cell_type": "markdown", "source": "## Grouping And Aggregating" }, { "metadata": {}, "cell_type": "markdown", "source": [ "A `DataFrame` can be grouped by column keys, meaning its rows are split into groups based on the values in the key columns.\n", "The `.groupBy { }` operation selects 
columns and groups the `DataFrame` by their values, using them as grouping keys.\n", "\n", "The result is a `GroupBy` — a `DataFrame`-like structure that associates each key with the corresponding subset of the original `DataFrame`.\n", "\n", "Group `dfWithIsIntellij` by \"isIntellij\":" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:05.957677Z", "start_time": "2025-05-27T17:21:05.783597Z" } }, "cell_type": "code", "source": [ "val groupedByIsIntellij = dfWithIsIntellij.groupBy { isIntellij }\n", "groupedByIsIntellij" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
isIntellijgroup
true
DataFrame [7 x 4]
namestarsCounttopicsisIntellij
ideavim6120[ideavim, intellij, intellij-platform...true
intellij-community12926[code-editor, ide, intellij, intellij...true
intellij-scala1066[intellij-idea, intellij-plugin, scala]true
intellij-plugins1737[]true
idea-gitignore1181[gitignore, ignore-files, intellij, i...true

... showing only top 5 of 7 rows

false
DataFrame [17 x 4]
namestarsCounttopicsisIntellij
MPS1241[domain-specific-language, dsl]false
kotlin39402[compiler, gradle-plugin, intellij-pl...false
Exposed5688[dao, kotlin, orm, sql]false
kotlin-web-site1074[kotlin]false
swot1072[]false

... showing only top 5 of 17 rows

\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"isIntellij\",\"group\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"},{\"kind\":\"FrameColumn\"}],\"nrow\":2,\"ncol\":2},\"kotlin_dataframe\":[{\"isIntellij\":true,\"group\":{\"data\":[{\"name\":\"ideavim\",\"starsCount\":6120,\"topics\":[\"ideavim\",\"intellij\",\"intellij-platform\",\"jb-official\",\"kotlin\",\"vim\",\"vim-emulator\"],\"isIntellij\":true},{\"name\":\"intellij-community\",\"starsCount\":12926,\"topics\":[\"code-editor\",\"ide\",\"intellij\",\"intellij-community\",\"intellij-platform\"],\"isIntellij\":true},{\"name\":\"intellij-scala\",\"starsCount\":1066,\"topics\":[\"intellij-idea\",\"intellij-plugin\",\"scala\"],\"isIntellij\":true},{\"name\":\"intellij-plugins\",\"starsCount\":1737,\"topics\":[\"\"],\"isIntellij\":true},{\"name\":\"idea-gitignore\",\"starsCount\":1181,\"topics\":[\"gitignore\",\"ignore-files\",\"intellij\",\"intellij-plugin\",\"java\"],\"isIntellij\":true},{\"name\":\"gradle-intellij-plugin\",\"starsCount\":1058,\"topics\":[\"gradle\",\"gradle-intellij-plugin\",\"gradle-kotlin-dsl\",\"groovy\",\"intellij\",\"intellij-ides\",\"intellij-platform\",\"intellij-plugin\",\"intellij-sdk\",\"jetbrains-plugin\",\"kotlin\",\"plugin-verifier\",\"publishing-dsl\",\"setup-dsl\",\"teamcity\",\"travis-configuration\"],\"isIntellij\":true},{\"name\":\"intellij-platform-plugin-template\",\"starsCount\":1133,\"topics\":[\"intellij\",\"intellij-idea\",\"intellij-idea-plugin\",\"intellij-platform\",\"intellij-plugin\",\"intellij-plugins\",\"jetbrains-plugin\"],\"isIntellij\":true}],\"metadata\":{\"kind\":\"FrameColumn\",\"columns\":[\"name\",\"starsCount\",\"topics\",\"isIntellij\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.collections.List\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"}],
\"ncol\":4,\"nrow\":7}}},{\"isIntellij\":false,\"group\":{\"data\":[{\"name\":\"MPS\",\"starsCount\":1241,\"topics\":[\"domain-specific-language\",\"dsl\"],\"isIntellij\":false},{\"name\":\"kotlin\",\"starsCount\":39402,\"topics\":[\"compiler\",\"gradle-plugin\",\"intellij-plugin\",\"kotlin\",\"kotlin-library\",\"maven-plugin\",\"programming-language\"],\"isIntellij\":false},{\"name\":\"Exposed\",\"starsCount\":5688,\"topics\":[\"dao\",\"kotlin\",\"orm\",\"sql\"],\"isIntellij\":false},{\"name\":\"kotlin-web-site\",\"starsCount\":1074,\"topics\":[\"kotlin\"],\"isIntellij\":false},{\"name\":\"swot\",\"starsCount\":1072,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"phpstorm-stubs\",\"starsCount\":1110,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"svg-sprite-loader\",\"starsCount\":1815,\"topics\":[\"sprite\",\"svg\",\"svg-sprite\",\"svg-stack\",\"webpack\",\"webpack-loader\",\"webpack-plugin\",\"webpack2\",\"webpack3\"],\"isIntellij\":false},{\"name\":\"resharper-unity\",\"starsCount\":1017,\"topics\":[\"hacktoberfest\",\"jetbrains\",\"plugin\",\"resharper\",\"resharper-plugin\",\"rider\",\"unity\",\"unity-editor\"],\"isIntellij\":false},{\"name\":\"kotlin-native\",\"starsCount\":7101,\"topics\":[\"c\",\"compiler\",\"kotlin\",\"llvm\",\"objective-c\"],\"isIntellij\":false},{\"name\":\"create-react-kotlin-app\",\"starsCount\":2424,\"topics\":[\"create-react-app\",\"jetbrains-ui\",\"kotlin\",\"react\",\"webpack\"],\"isIntellij\":false},{\"name\":\"ring-ui\",\"starsCount\":2836,\"topics\":[\"components\",\"jetbrains-ui\",\"react\"],\"isIntellij\":false},{\"name\":\"kotlinconf-app\",\"starsCount\":2628,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"JetBrainsMono\",\"starsCount\":6059,\"topics\":[\"coding-font\",\"font\",\"ligatures\",\"monospaced-font\",\"programming-font\",\"programming-ligatures\"],\"isIntellij\":false},{\"name\":\"skija\",\"starsCount\":2242,\"topics\":[\"2d\",\"graphics\",\"java\",\"skia\"],\"isIntellij\":false},{\"name\":\"projector-d
ocker\",\"starsCount\":1853,\"topics\":[\"awt\",\"docker\",\"swing\"],\"isIntellij\":false},{\"name\":\"projector-server\",\"starsCount\":1025,\"topics\":[\"awt\",\"swing\"],\"isIntellij\":false},{\"name\":\"compose-jb\",\"starsCount\":6805,\"topics\":[\"android\",\"awt\",\"compose\",\"declarative-ui\",\"desktop\",\"gui\",\"javascript\",\"kotlin\",\"multiplatform\",\"reactive\",\"swing\",\"ui\"],\"isIntellij\":false}],\"metadata\":{\"kind\":\"FrameColumn\",\"columns\":[\"name\",\"starsCount\",\"topics\",\"isIntellij\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.collections.List\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"}],\"ncol\":4,\"nrow\":17}}}]}" }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 19 }, { "metadata": {}, "cell_type": "markdown", "source": [ "A `GroupBy` can be aggregated: you can compute one or several summary statistics for each group.\n", "The result of the aggregation is a `DataFrame` containing the key columns along with new columns holding the computed statistics for the corresponding group.\n", "\n", "For example, `count()` computes the size of each group:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:06.108708Z", "start_time": "2025-05-27T17:21:05.960300Z" } }, "cell_type": "code", "source": "groupedByIsIntellij.count()", "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
isIntellijcount
true7
false17
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"isIntellij\",\"count\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"}],\"nrow\":2,\"ncol\":2},\"kotlin_dataframe\":[{\"isIntellij\":true,\"count\":7},{\"isIntellij\":false,\"count\":17}]}" }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 20 }, { "metadata": {}, "cell_type": "markdown", "source": "Compute several statistics with `.aggregate { }`, which provides a DSL for aggregating:" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:06.459635Z", "start_time": "2025-05-27T17:21:06.111467Z" } }, "cell_type": "code", "source": [ "groupedByIsIntellij.aggregate {\n", " // Compute sum and max of \"starsCount\" within each group into \"sumStars\" and \"maxStars\" columns\n", " sumOf { starsCount } into \"sumStars\"\n", " maxOf { starsCount } into \"maxStars\"\n", "}" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
isIntellijsumStarsmaxStars
true2522112926
false8539239402
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"isIntellij\",\"sumStars\",\"maxStars\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"}],\"nrow\":2,\"ncol\":3},\"kotlin_dataframe\":[{\"isIntellij\":true,\"sumStars\":25221,\"maxStars\":12926},{\"isIntellij\":false,\"sumStars\":85392,\"maxStars\":39402}]}" }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 21 }, { "metadata": {}, "cell_type": "markdown", "source": "## Sorting Rows" }, { "metadata": {}, "cell_type": "markdown", "source": [ "`.sortBy { }`/`.sortByDesc { }` sort rows by the values in the selected columns, returning a new `DataFrame` with the rows sorted.\n", "\n", "`take(n)` returns a new `DataFrame` with the first `n` rows.\n", "\n", "Combine them to get the top 10 repositories by number of stars:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:06.681243Z", "start_time": "2025-05-27T17:21:06.468520Z" } }, "cell_type": "code", "source": [ "val dfTop10 = dfWithIsIntellij\n", " // Sort by \"starsCount\" value descending\n", " .sortByDesc { starsCount }\n", " .take(10)\n", "dfTop10" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", "
namestarsCounttopicsisIntellij
kotlin39402[compiler, gradle-plugin, intellij-pl...false
intellij-community12926[code-editor, ide, intellij, intellij...true
kotlin-native7101[c, compiler, kotlin, llvm, objective-c]false
compose-jb6805[android, awt, compose, declarative-u...false
ideavim6120[ideavim, intellij, intellij-platform...true
JetBrainsMono6059[coding-font, font, ligatures, monosp...false
Exposed5688[dao, kotlin, orm, sql]false
ring-ui2836[components, jetbrains-ui, react]false
kotlinconf-app2628[]false
create-react-kotlin-app2424[create-react-app, jetbrains-ui, kotl...false
\n", " \n", " \n", " " ], "application/kotlindataframe+json": "{\"$version\":\"2.1.1\",\"metadata\":{\"columns\":[\"name\",\"starsCount\",\"topics\",\"isIntellij\"],\"types\":[{\"kind\":\"ValueColumn\",\"type\":\"kotlin.String\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Int\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.collections.List\"},{\"kind\":\"ValueColumn\",\"type\":\"kotlin.Boolean\"}],\"nrow\":10,\"ncol\":4},\"kotlin_dataframe\":[{\"name\":\"kotlin\",\"starsCount\":39402,\"topics\":[\"compiler\",\"gradle-plugin\",\"intellij-plugin\",\"kotlin\",\"kotlin-library\",\"maven-plugin\",\"programming-language\"],\"isIntellij\":false},{\"name\":\"intellij-community\",\"starsCount\":12926,\"topics\":[\"code-editor\",\"ide\",\"intellij\",\"intellij-community\",\"intellij-platform\"],\"isIntellij\":true},{\"name\":\"kotlin-native\",\"starsCount\":7101,\"topics\":[\"c\",\"compiler\",\"kotlin\",\"llvm\",\"objective-c\"],\"isIntellij\":false},{\"name\":\"compose-jb\",\"starsCount\":6805,\"topics\":[\"android\",\"awt\",\"compose\",\"declarative-ui\",\"desktop\",\"gui\",\"javascript\",\"kotlin\",\"multiplatform\",\"reactive\",\"swing\",\"ui\"],\"isIntellij\":false},{\"name\":\"ideavim\",\"starsCount\":6120,\"topics\":[\"ideavim\",\"intellij\",\"intellij-platform\",\"jb-official\",\"kotlin\",\"vim\",\"vim-emulator\"],\"isIntellij\":true},{\"name\":\"JetBrainsMono\",\"starsCount\":6059,\"topics\":[\"coding-font\",\"font\",\"ligatures\",\"monospaced-font\",\"programming-font\",\"programming-ligatures\"],\"isIntellij\":false},{\"name\":\"Exposed\",\"starsCount\":5688,\"topics\":[\"dao\",\"kotlin\",\"orm\",\"sql\"],\"isIntellij\":false},{\"name\":\"ring-ui\",\"starsCount\":2836,\"topics\":[\"components\",\"jetbrains-ui\",\"react\"],\"isIntellij\":false},{\"name\":\"kotlinconf-app\",\"starsCount\":2628,\"topics\":[\"\"],\"isIntellij\":false},{\"name\":\"create-react-kotlin-app\",\"starsCount\":2424,\"topics\":[\"create-react-app\",\"jetbrains-ui\",\"kotlin\",\"react\",\"webpa
ck\"],\"isIntellij\":false}]}" }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 22 }, { "metadata": {}, "cell_type": "markdown", "source": "## Plotting With Kandy" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Kandy is a Kotlin plotting library designed to bring Kotlin DataFrame features into chart creation, providing a convenient and typesafe way to build data visualizations.\n", "\n", "Kandy can be loaded into notebook using `%use kandy`:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:09.706774Z", "start_time": "2025-05-27T17:21:06.684707Z" } }, "cell_type": "code", "source": "%use kandy@kc25", "outputs": [], "execution_count": 23 }, { "metadata": {}, "cell_type": "markdown", "source": "Build a simple bar chart with the `.plot { }` extension for DataFrame, that allows to use extension properties inside Kandy plotting DSL (a plot will be rendered as output after cell execution):" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:11.687983Z", "start_time": "2025-05-27T17:21:09.711112Z" } }, "cell_type": "code", "source": [ "dfTop10.plot {\n", " bars {\n", " x(name)\n", " y(starsCount)\n", " }\n", "\n", " layout.title = \"Top 10 JetBrains repositories by stars count\"\n", "}" ], "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " kotlin\n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " intellij-community\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " kotlin-native\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " compose-jb\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " ideavim\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " JetBrainsMono\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Exposed\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " ring-ui\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " kotlinconf-app\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " create-react-kotlin-app\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 0\n", " \n", " \n", " \n", " \n", " \n", " \n", " 10,000\n", " \n", " \n", " \n", " \n", " \n", " \n", " 20,000\n", " \n", " \n", " \n", " \n", " \n", " \n", " 30,000\n", " \n", " \n", " \n", " \n", " \n", " \n", " 40,000\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Top 10 JetBrains repositories by stars count\n", " \n", " \n", " \n", " \n", " starsCount\n", " \n", " \n", " \n", " \n", " name\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", " " ], "application/plot+json": { "output_type": "lets_plot_spec", "output": { "ggtitle": { "text": "Top 10 JetBrains repositories by stars count" }, "mapping": {}, "data": { "starsCount": [ 39402.0, 12926.0, 7101.0, 6805.0, 6120.0, 6059.0, 5688.0, 2836.0, 2628.0, 2424.0 ], "name": [ "kotlin", "intellij-community", "kotlin-native", "compose-jb", "ideavim", "JetBrainsMono", "Exposed", "ring-ui", "kotlinconf-app", "create-react-kotlin-app" ] }, "kind": "plot", "scales": [ { "aesthetic": "x", "discrete": true }, { "aesthetic": "y", "limits": [ null, null ] } ], "layers": [ { "mapping": { "x": "name", "y": "starsCount" }, "stat": "identity", "sampling": "none", "inherit_aes": false, "position": "dodge", "geom": "bar" } ], "data_meta": { "series_annotations": [ { "type": "str", "column": "name" }, { "type": 
"int", "column": "starsCount" } ] } }, "apply_color_scheme": true, "swing_enabled": true } }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 24 }, { "metadata": {}, "cell_type": "markdown", "source": "## Write DataFrame" }, { "metadata": {}, "cell_type": "markdown", "source": [ "`DataFrame` supports writing to (almost) all formats that it is capable of reading.\n", "\n", "Write to Excel:" ] }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-27T17:21:14.823771Z", "start_time": "2025-05-27T17:21:11.697776Z" } }, "cell_type": "code", "source": "dfWithIsIntellij.writeExcel(\"jb_repos.xlsx\")", "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2025-05-27T17:21:11.899521Z Execution of code 'dfWithIsIntellij.wri...' ERROR Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...\n" ] } ], "execution_count": 25 } ], "metadata": { "kernelspec": { "display_name": "Kotlin", "language": "kotlin", "name": "kotlin" }, "language_info": { "name": "kotlin", "version": "1.9.23", "mimetype": "text/x-kotlin", "file_extension": ".kt", "pygments_lexer": "kotlin", "codemirror_mode": "text/x-kotlin", "nbconvert_exporter": "" } }, "nbformat": 4, "nbformat_minor": 0 }