# New Discussion Tool Instrumentation QA

1. [Differentiate between events emitted from the Reply Tool and the New Discussion Tool](#Differentiate-between-events-emitted-from-the-Reply-Tool-and-the-New-Discussion-Tool)
2. [Differentiate between edits to existing sections and the creation of new sections](#Differentiate-between-edits-to-existing-sections-and-the-creation-of-new-sections)

## Differentiate between events emitted from the Reply Tool and the New Discussion Tool

[Task](https://phabricator.wikimedia.org/T265099)

The EditAttemptStep schema's existing init_type field will be used to differentiate between events emitted from the Reply Tool and the New Discussion Tool.

Events from the Reply Tool and New Discussion Tool should be logged as follows:

* Reply Tool events: event.action = 'init', event.integration = 'discussiontools', event.init_type = 'page'
* New Discussion Tool events: event.action = 'init', event.integration = 'discussiontools', event.init_type = 'section'

The change to the `init_type` field was made on 12 January 2021.
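
As a minimal sketch of the logging rules above, the snippet below labels `discussiontools` init events by tool based on `init_type`. The helper name `classify_dt_type` and the two example rows are hypothetical and only illustrate the rule; they are not part of the instrumentation.

```r
# Hypothetical helper: map init_type to a tool label, following the rules above.
classify_dt_type <- function(init_type) {
  ifelse(init_type == "section", "new discussion tool", "reply tool")
}

# Two made-up init events, one per tool, for illustration only.
example_events <- data.frame(
  action      = c("init", "init"),
  integration = c("discussiontools", "discussiontools"),
  init_type   = c("page", "section"),
  stringsAsFactors = FALSE
)

example_events$dt_type <- classify_dt_type(example_events$init_type)
print(example_events)
```

The query in the next cells applies the same rule in SQL with `IF(event.init_type = 'section', ...)`.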

In [56]:
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
    library(magrittr); library(zeallot); library(glue); library(tidyverse); library(zoo); library(lubridate)
    library(scales)
})

In [87]:
# Collect init events by discussion tool type
query <-
"
SELECT 
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) as date,
  wiki AS wiki,
  event.editing_session_id AS session_id,
  event.platform as platform,
  event.editor_interface as interface,
  event.init_mechanism as init_mechanism,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') as dt_type,
  COUNT(*) as n_events
FROM event.editattemptstep
WHERE
  event.action = 'init'
  AND event.integration = 'discussiontools'
  AND year = 2021
  AND dt >= '2021-01-01'
GROUP BY
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),
  wiki, 
  event.editing_session_id,
  event.init_mechanism,
  event.platform,
  event.editor_interface,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') 
"

In [88]:
collect_init_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



## Reply Tool vs New Discussion Tool Events by Date 

In [89]:
dt_events_bytype <- collect_init_events %>%
    group_by(date, dt_type) %>%
    summarise(total_events = sum(n_events))

dt_events_bytype

`summarise()` regrouping output by 'date' (override with `.groups` argument)



date,dt_type,total_events
<chr>,<chr>,<int>
2021-01-01,reply tool,80
2021-01-02,reply tool,83
2021-01-03,reply tool,91
2021-01-04,reply tool,78
2021-01-05,reply tool,89
2021-01-06,reply tool,76
2021-01-07,reply tool,68
2021-01-08,reply tool,64
2021-01-09,reply tool,59
2021-01-10,reply tool,79


Both reply tool and new discussion tool events are being logged, and it is possible to differentiate them based on `init_type`. There are fewer `init_type = section` events because these come from the new discussion tool, which has not been deployed for as long as the reply tool.

A total of 24 new discussion tool events have been logged since 12 January 2021, as expected.


## Reply Tool vs New Discussion Tool Events by Platform and Editor Interface

In [90]:
dt_events_byplatform <- collect_init_events %>%
    group_by(dt_type, platform, interface) %>%
    summarise(total_events = sum(n_events))

dt_events_byplatform

`summarise()` regrouping output by 'dt_type', 'platform' (override with `.groups` argument)



dt_type,platform,interface,total_events
<chr>,<chr>,<chr>,<int>
new discussion tool,desktop,visualeditor,8
new discussion tool,desktop,wikitext,16
reply tool,desktop,visualeditor,827
reply tool,desktop,wikitext,1227


Events are recorded for both the visualeditor and wikitext interfaces, and only on the desktop platform, as expected.

## New Discussion Tool Events and Unique Sessions by Wiki

In [91]:
dt_events_bywiki <- collect_init_events %>%
    filter(dt_type == "new discussion tool") %>%
    group_by(dt_type, wiki) %>%
    summarise(total_events = sum(n_events),
             distinct_sessions = n_distinct(session_id))

dt_events_bywiki

`summarise()` regrouping output by 'dt_type' (override with `.groups` argument)



dt_type,wiki,total_events,distinct_sessions
<chr>,<chr>,<int>,<int>
new discussion tool,cswiki,7,7
new discussion tool,cswikinews,1,1
new discussion tool,enwiki,16,16


New discussion tool events have been recorded on enwiki, cswikinews, and cswiki.

## Reply Tool vs New Discussion Tool Events by Init Mechanism

In [54]:
dt_events_bymechanism <- collect_init_events %>%
    group_by(dt_type, init_mechanism) %>%
    summarise(total_events = sum(n_events))

dt_events_bymechanism 

`summarise()` regrouping output by 'dt_type' (override with `.groups` argument)



dt_type,init_mechanism,total_events
<chr>,<chr>,<int>
new discussion tool,click,24
reply tool,click,2048


To date, all new discussion tool and reply tool events have been recorded with `init_mechanism = click`. This is fine because `init_mechanism` is not needed to distinguish these two event types. Changes will be needed to track new section events in the existing workflow; these will be made as part of [T272544](https://phabricator.wikimedia.org/T272544).

## Reply Tool vs New Discussion Tool Edit Completion Rate

Check to make sure it will be possible to calculate edit completion rate for each tool type, which is one of the key metrics for this tool.

In [71]:
query <- 
"WITH init_sessions AS (
--first find all dt and reply tool events based on init type
SELECT 
  event.editing_session_id AS session_id,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') as dt_type,
  wiki AS wiki
FROM event.editattemptstep
WHERE
  year = 2021 
  AND dt >= '2021-01-12'  -- when instrumentation was deployed
  AND event.action = 'init'
  AND event.integration= 'discussiontools'
)

-- Find associated savesuccess events
SELECT
  eas.event.user_editcount AS edit_count,
  eas.event.user_id AS user,
  init_sessions.dt_type as dt_type,
  eas.event.editing_session_id AS session_id,
  eas.wiki AS wiki,
  COUNT(*) AS save_events
FROM event.editattemptstep eas
INNER JOIN
    init_sessions 
    ON eas.event.editing_session_id = init_sessions.session_id 
    AND eas.wiki = init_sessions.wiki
WHERE
  year = 2021 
-- events since deployment date
  AND dt >= '2021-01-12'
  AND eas.event.action = 'saveSuccess'
  AND eas.event.integration= 'discussiontools'
-- remove anonymous users
  AND eas.event.user_id != 0
GROUP BY 
  eas.event.user_id,
  init_sessions.dt_type,
  eas.event.user_editcount,
  eas.event.editing_session_id,
  eas.wiki
"

In [72]:
collect_savesuccess_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [99]:
dt_save_events_bytype <- collect_savesuccess_events %>%
    group_by(dt_type) %>%
    summarise(num_save_sessions = n_distinct(session_id),
              num_save_events = sum(save_events))

dt_save_events_bytype

`summarise()` ungrouping output (override with `.groups` argument)



dt_type,num_save_sessions,num_save_events
<chr>,<int>,<int>
new discussion tool,6,6
reply tool,856,856


In [100]:
new_dt_save_events_bywiki <- collect_savesuccess_events %>%
    filter(dt_type == 'new discussion tool') %>%
    group_by (wiki, dt_type)  %>%
    summarize (num_save_sessions = n_distinct(session_id))

new_dt_save_events_bywiki

`summarise()` regrouping output by 'wiki' (override with `.groups` argument)



wiki,dt_type,num_save_sessions
<chr>,<chr>,<int>
cswiki,new discussion tool,1
enwiki,new discussion tool,5


A total of 6 new discussion tool sessions reached `saveSuccess`. Both wikis with saves (cswiki and enwiki) also logged new discussion tool init events.
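
As a rough check that the completion rate can be derived from these tables, the sketch below divides saved sessions by init sessions for the new discussion tool, using the per-wiki counts shown above. It is only an approximation under stated assumptions: the save query excludes anonymous users while the init counts do not, and the two queries use different start dates (all new discussion tool events fall after deployment). The variable names are illustrative.

```r
# Distinct new discussion tool sessions with an init event, by wiki (table above).
init_sessions <- data.frame(
  wiki = c("cswiki", "cswikinews", "enwiki"),
  init_sessions = c(7L, 1L, 16L),
  stringsAsFactors = FALSE
)

# Distinct new discussion tool sessions reaching saveSuccess, by wiki (table above).
save_sessions <- data.frame(
  wiki = c("cswiki", "enwiki"),
  save_sessions = c(1L, 5L),
  stringsAsFactors = FALSE
)

# Left join so that wikis with init events but no saves are kept with 0 saves.
completion <- merge(init_sessions, save_sessions, by = "wiki", all.x = TRUE)
completion$save_sessions[is.na(completion$save_sessions)] <- 0L
completion$completion_rate <- completion$save_sessions / completion$init_sessions

# Overall rate across wikis: saved sessions / init sessions.
overall_rate <- sum(completion$save_sessions) / sum(completion$init_sessions)

print(completion)
print(overall_rate)
```

A production version would join init and save events on `session_id` within a single query window rather than combining two summary tables.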


# Differentiate between edits to existing sections and the creation of new sections

[Task](https://phabricator.wikimedia.org/T272544)

## Background

New events were added to EditAttemptStep to enable the software to distinguish edits to existing sections from edits associated with the creation of new sections.

Notes re instrumentation:
- This is explicitly a change to the existing logging in VisualEditor / WikiEditor to get the semantics of init_mechanism tweaked so that (a) they're consistent, and (b) you can always tell whether there's a new section being created.
- No impact or change to DiscussionTools instrumentation, unless and until we implement a takeover for the section=new URL.
- These patches add a new possible value for `init_mechanism: 'url-new'`. It will be logged when direct navigation occurs to a URL that triggers an editor for either a new page or a new section.
- The VE patch also makes its use of `init_mechanism=new` consistent with WikiEditor and the schema docs. It was previously not using `new` when you clicked the "Create" tab after navigating to a non-existent page. (You could still distinguish this case by looking for `page_id=0`.)





In [5]:
# Collect all init events by date since deployment
query <-
"
SELECT 
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) as edit_dt,
  wiki AS wiki,
  event.editing_session_id AS session_id,
  event.editor_interface as interface,
  event.init_mechanism as init_mechanism,
  event.init_type as init_type,
  event.integration as integration,
  COUNT(*) as n_events
FROM event.editattemptstep
WHERE
  event.action = 'init'
  AND year = 2021
-- review events following deployment
  AND dt >= '2021-02-15'
  AND event.platform = 'desktop'
GROUP BY
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),
  wiki, 
  event.editing_session_id,
  event.init_mechanism,
  event.editor_interface,
  event.init_type,
  event.integration
"

In [6]:
collect_init_events_all <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



## Init Events by Init Mechanism and Type


In [11]:
init_events_bymechanism <- collect_init_events_all %>%
    group_by(init_mechanism, init_type) %>%
    summarise(n_events = sum(n_events))

init_events_bymechanism

`summarise()` regrouping output by 'init_mechanism' (override with `.groups` argument)



init_mechanism,init_type,n_events
<chr>,<chr>,<int>
click,page,244735
click,section,181952
new,page,156614
new,section,7989
url,page,112989
url,section,165317
url-new,page,223159
url-new,section,2285


We are logging both `init_mechanism = 'url-new'` and `init_mechanism = 'url'` events for the page and section init types. Using the existing logging in VisualEditor/WikiEditor, this now allows us to distinguish new from existing section or page edits that start from direct navigation to a URL.

The number of logged events for each `init_mechanism` type seems reasonable given the likelihood of occurrence. The majority of new section edits (78%) are initiated by clicking a link on the page (`init_mechanism = 'new'`) rather than by direct navigation to a URL that opens an editor for a new section (`init_mechanism = 'url-new'`). Nearly all `url-new` events (about 99%) have `init_type = page`.
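
The shares quoted above can be recomputed from the summary table. The following is a small sketch in base R with the counts copied from the output above; the helper `events_for` is hypothetical and exists only for this illustration.

```r
# Init event counts by mechanism and type, copied from the table above.
init_counts <- data.frame(
  init_mechanism = rep(c("click", "new", "url", "url-new"), each = 2),
  init_type      = rep(c("page", "section"), times = 4),
  n_events       = c(244735, 181952, 156614, 7989, 112989, 165317, 223159, 2285),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the count for one mechanism/type pair.
events_for <- function(mechanism, type) {
  init_counts$n_events[init_counts$init_mechanism == mechanism &
                         init_counts$init_type == type]
}

# Share of new-section inits started by clicking (new) vs. direct URL (url-new).
new_section_click_share <- events_for("new", "section") /
  (events_for("new", "section") + events_for("url-new", "section"))

# Share of url-new inits that are page-level rather than section-level.
url_new_page_share <- events_for("url-new", "page") /
  (events_for("url-new", "page") + events_for("url-new", "section"))

print(round(100 * c(new_section_click_share, url_new_page_share), 1))
```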


## Init Events By Integration Type

In [12]:
init_events_byintegration <- collect_init_events_all %>%
    group_by(init_mechanism, init_type, integration) %>%
    summarise(n_events = sum(n_events))

init_events_byintegration

`summarise()` regrouping output by 'init_mechanism', 'init_type' (override with `.groups` argument)



init_mechanism,init_type,integration,n_events
<chr>,<chr>,<chr>,<int>
click,page,discussiontools,9418
click,page,page,235317
click,section,discussiontools,459
click,section,page,181493
new,page,page,156614
new,section,page,7989
url,page,page,112989
url,section,page,165317
url-new,page,page,223159
url-new,section,page,2285


Discussion tool events (as indicated by `event.integration = 'discussiontools'`) are only recorded with `init_mechanism = 'click'`; no `init_mechanism = 'new'` events were recorded for them. This is expected: per DLynch's comment, the changes were not implemented for DiscussionTools. We can still distinguish new vs existing events for discussion tools based on current instrumentation. See https://phabricator.wikimedia.org/T265099 for details.

Non-discussion tool events (as indicated by `event.integration = 'page'`) are recorded for all expected `init_mechanism` types (click, new, url, and url-new) for both page and section types.

## Init Events By Editor Interface

In [13]:
init_events_byinterface <- collect_init_events_all %>%
    group_by(init_mechanism, init_type, interface) %>%
    summarise(n_events = sum(n_events))

init_events_byinterface

`summarise()` regrouping output by 'init_mechanism', 'init_type' (override with `.groups` argument)



init_mechanism,init_type,interface,n_events
<chr>,<chr>,<chr>,<int>
click,page,visualeditor,37021
click,page,wikitext,204263
click,page,wikitext-2017,3451
click,section,visualeditor,20846
click,section,wikitext,159063
click,section,wikitext-2017,2043
new,page,visualeditor,3393
new,page,wikitext,153200
new,page,wikitext-2017,21
new,section,wikitext,7987


All `init_mechanism` types are recorded for all three editor interfaces as expected. `init_mechanism = new` is now recorded for VisualEditor events.

## Init Events on Talk Pages Only

In [19]:
# Collect all init events on talk pages only by date since deployment
query <-
"
SELECT 
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) as edit_dt,
  wiki AS wiki,
  event.editing_session_id AS session_id,
  event.editor_interface as interface,
  event.init_mechanism as init_mechanism,
  event.init_type as init_type,
  event.integration as integration,
  COUNT(*) as n_events
FROM event.editattemptstep
WHERE
  event.action = 'init'
  AND year = 2021
-- review events following deployment
  AND dt >= '2021-02-15'
  AND event.platform = 'desktop'
 -- review all talk namespaces
  AND event.page_ns % 2 = 1
GROUP BY
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),
  wiki, 
  event.editing_session_id,
  event.editor_interface,
  event.init_mechanism,
  event.init_type,
  event.integration
"

In [20]:
collect_init_events_talkonly <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [21]:
head(collect_init_events_talkonly)

Unnamed: 0_level_0,edit_dt,wiki,session_id,interface,init_mechanism,init_type,integration,n_events
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>
1,2021-02-15,abwiki,92c42880482c2ee6e2a5afdce7aee3e4,wikitext,url,page,page,1
2,2021-02-15,acewiki,10f0051052705382eaa8b72e6fb8848f,wikitext,url,page,page,1
3,2021-02-15,afwiki,293c83b05e3e35449759039616f67fb3,wikitext,click,page,page,1
4,2021-02-15,afwiktionary,3561b9c0202922f1a30ab028fa056756,wikitext,url-new,page,page,1
5,2021-02-15,arwiki,1cb977f06ad501ba75cd26c1f6f7b6be,wikitext,new,page,page,1
6,2021-02-15,arwiki,9648bf608a5058d0f2bfb8c0dda02692,wikitext,new,page,page,1
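
The `event.page_ns % 2 = 1` filter in the query relies on the MediaWiki convention that talk namespaces have odd IDs. Below is a minimal sketch of the same check in R, assuming a hypothetical helper `is_talk_namespace` and a made-up vector of namespace IDs. R's `%%` returns a non-negative remainder for negative inputs, so the sketch adds an explicit guard to exclude negative namespace IDs such as -1.

```r
# Hypothetical helper: TRUE for talk namespaces (positive, odd namespace IDs).
is_talk_namespace <- function(page_ns) {
  page_ns > 0 & page_ns %% 2 == 1
}

# Made-up namespace IDs for illustration (e.g. 0 = main, 1 = Talk, 3 = User talk).
example_ns <- c(-1, 0, 1, 3, 4, 12, 15)

print(data.frame(page_ns = example_ns, is_talk = is_talk_namespace(example_ns)))
```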


## Talk Page Init Events by Integration and Init Type

In [22]:
talk_init_events_byintegration <- collect_init_events_talkonly %>%
    group_by(init_mechanism, init_type, integration) %>%
    summarise(n_events = sum(n_events))

talk_init_events_byintegration

`summarise()` regrouping output by 'init_mechanism', 'init_type' (override with `.groups` argument)



init_mechanism,init_type,integration,n_events
<chr>,<chr>,<chr>,<int>
click,page,discussiontools,6452
click,page,page,7242
click,section,discussiontools,443
click,section,page,7319
new,page,page,25091
new,section,page,6097
url,page,page,4387
url,section,page,3226
url-new,page,page,20379
url-new,section,page,1543


All expected init types are logged on talk pages. The number of logged events for each `init_mechanism` type seems reasonable given the likelihood of occurrence.

The majority of new section edits on talk pages (80%) are initiated by clicking a link on the page (`init_mechanism = 'new'`) rather than by direct navigation to a URL that opens an editor for a new section (`init_mechanism = 'url-new'`). Most `url-new` events on talk pages (about 93%) have `init_type = page`.



## Talk Page Init Events by Editor Interface

In [25]:
talk_init_events_byinterface <- collect_init_events_talkonly %>%
    group_by(integration, init_mechanism, init_type, interface) %>%
    summarise(n_events = sum(n_events))

talk_init_events_byinterface

`summarise()` regrouping output by 'integration', 'init_mechanism', 'init_type' (override with `.groups` argument)



integration,init_mechanism,init_type,interface,n_events
<chr>,<chr>,<chr>,<chr>,<int>
discussiontools,click,page,visualeditor,2841
discussiontools,click,page,wikitext,3611
discussiontools,click,section,visualeditor,141
discussiontools,click,section,wikitext,302
page,click,page,visualeditor,2
page,click,page,wikitext,7123
page,click,page,wikitext-2017,117
page,click,section,visualeditor,1
page,click,section,wikitext,7214
page,click,section,wikitext-2017,104


Nearly all visual editor events on talk pages are recorded as discussiontools events, as expected.

## Summary of New and Existing Section Edits on Talk Pages

In [36]:
existing_vs_new_section_edits <- collect_init_events_talkonly %>%
    filter(integration == 'page',
           init_type == 'section') %>%
    mutate(section_status = ifelse(init_mechanism == 'url-new'|init_mechanism == 'new', 'new', 'existing'))%>%
    group_by(section_status, init_mechanism) %>%
    summarise(n_events = sum(n_events))

existing_vs_new_section_edits

`summarise()` regrouping output by 'section_status' (override with `.groups` argument)



section_status,init_mechanism,n_events
<chr>,<chr>,<int>
existing,click,7319
existing,url,3226
new,new,6097
new,url-new,1543


The table above shows the number of new vs existing section edits using current VisualEditor / WikiEditor section workflows (not discussion tool related edits).

The number of logged events for each section edit type seems reasonable given the likelihood of occurrence. Since deployment of the instrumentation changes, a slight majority of section edits on talk pages have been to existing sections (58%).

Most edits (79.8%) to create new sections on talk pages are started by clicking the link on the page rather than by direct navigation to a URL, as expected.
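
For reference, the two percentages in this summary follow directly from the table above. This is a short sketch in base R with the counts copied from that output; the variable names are illustrative.

```r
# Talk page section init events (integration = 'page'), copied from the table above.
section_counts <- data.frame(
  section_status = c("existing", "existing", "new", "new"),
  init_mechanism = c("click", "url", "new", "url-new"),
  n_events       = c(7319, 3226, 6097, 1543),
  stringsAsFactors = FALSE
)

# Total events per section status.
status_totals <- tapply(section_counts$n_events, section_counts$section_status, sum)

# Share of all section edits that target existing sections.
existing_share <- status_totals[["existing"]] / sum(status_totals)

# Share of new-section edits started by clicking rather than by direct URL.
new_click_share <-
  section_counts$n_events[section_counts$init_mechanism == "new"] /
  status_totals[["new"]]

print(round(100 * c(existing = existing_share, new_via_click = new_click_share), 1))
```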


# Confirm save_success_timing and other EditAttemptStep events are logged for DT-related events

[Task](https://phabricator.wikimedia.org/T290931)

Note: The fix to log `save_success_timing` was deployed on 16 September 2021.

## Save Success Events

In [57]:
# Collect savesuccess events by discussion tool type
query <-
"
SELECT 
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) as `date`,
  wiki AS wiki,
  event.editing_session_id AS session_id,
  event.platform as platform,
  event.editor_interface as interface,
  event.save_success_timing As save_success_timing
FROM event.editattemptstep
WHERE
  event.action = 'saveSuccess'
  AND event.integration = 'discussiontools'
  AND year = 2021
  AND dt >= '2021-09-16'
"

In [58]:
save_success_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [59]:
save_success_events$date <- as.Date(save_success_events$date, format = "%Y-%m-%d")

In [62]:
save_success_events %>%
select(-3) %>%
filter(date <= '2021-09-17')%>%
arrange(date)

date,wiki,platform,interface,save_success_timing
<date>,<chr>,<chr>,<chr>,<int>
2021-09-16,fawiki,desktop,visualeditor,-1
2021-09-16,simplewiki,desktop,wikitext-2017,-1
2021-09-16,hewiki,desktop,wikitext-2017,18237
2021-09-16,enwiki,desktop,wikitext-2017,-1
2021-09-16,plwiki,desktop,wikitext-2017,-1
2021-09-16,zhwiki,desktop,wikitext-2017,-1
2021-09-16,itwiki,desktop,wikitext-2017,6455
2021-09-16,simplewiki,desktop,wikitext-2017,-1
2021-09-16,arwiki,desktop,visualeditor,-1
2021-09-16,frwiki,desktop,wikitext-2017,-1


Confirmed that `save_success_timing` has been recorded for discussion tool save success events since 17 September 2021.

Are there any events with a missing `save_success_timing` (logged as -1) on or after 17 September 2021?

In [63]:
save_success_events %>%
filter(date >= '2021-09-17',
       save_success_timing == -1)

date,wiki,session_id,platform,interface,save_success_timing
<date>,<chr>,<chr>,<chr>,<chr>,<int>
2021-09-18,nlwiki,8269305fab50769c18b0,desktop,visualeditor,-1


One save success event is missing `save_success_timing`, but the field is populated for all other events as expected. The cause of this single missing value is unclear.
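
To monitor this over time, the count of missing timings can be tallied per day. The sketch below treats -1 as the missing-value marker, as the filter above does, and uses only the ten rows displayed for 16 September 2021, so the resulting numbers describe that sample rather than the full dataset. The column names `n_events`, `n_missing`, and `share_missing` are illustrative.

```r
# The ten save success rows displayed above for 2021-09-16 (a sample, not all events).
displayed_rows <- data.frame(
  date = as.Date(rep("2021-09-16", 10)),
  wiki = c("fawiki", "simplewiki", "hewiki", "enwiki", "plwiki",
           "zhwiki", "itwiki", "simplewiki", "arwiki", "frwiki"),
  save_success_timing = c(-1, -1, 18237, -1, -1, -1, 6455, -1, -1, -1),
  stringsAsFactors = FALSE
)

# Flag rows where the timing is missing (-1) and count per day.
displayed_rows$n_events  <- 1L
displayed_rows$n_missing <- as.integer(displayed_rows$save_success_timing == -1)

missing_by_date <- aggregate(cbind(n_events, n_missing) ~ date,
                             data = displayed_rows, FUN = sum)
missing_by_date$share_missing <- missing_by_date$n_missing / missing_by_date$n_events

print(missing_by_date)
```

Applied to the full `save_success_events` data frame, the same aggregation would show whether missing values persist after the fix.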

Next, I'll take a quick look at all events logged for the discussion tools to check whether any other fields are missing.

In [28]:
# Collect all DT-related events logged since 17 September 2021
query <-
"
SELECT 
  event.action,
  event.init_type,
  event.init_mechanism,
  event.init_timing,
  event.ready_timing,
  event.loaded_timing,
  event.first_change_timing,
  event.save_intent_timing,
  event.save_attempt_timing,
  event.save_success_timing,
  event.save_failure_type,
  event.save_failure_message,
  event.abort_type,
  event.abort_mechanism,
  event.abort_timing,
  event.editor_interface,
  event.platform,
  event.page_id,
  event.page_ns,
  event.page_title
FROM event.editattemptstep
WHERE
  event.integration = 'discussiontools'
  AND year = 2021
  AND dt >= '2021-09-17'
"

In [29]:
all_dt_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



## Init events

In [42]:
init_dt_events <- all_dt_events %>%
    filter(action == 'init')

head(init_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,init,page,click,,,,,,,,,,,,,wikitext-2017,desktop,2544264,3,بحث_کاربر:Jeeputer
2,init,page,click,,,,,,,,,,,,,wikitext-2017,desktop,202483,4,Wikipedia:管理员布告板/3RR
3,init,page,click,,,,,,,,,,,,,visualeditor,desktop,4799201,1,Discussion:Rokhaya_Diallo
4,init,page,click,,,,,,,,,,,,,wikitext-2017,desktop,255030,4,Wikipedia:優良條目評選
5,init,section,click,,,,,,,,,,,,,visualeditor,desktop,33864885,3,User_talk:Urbwek
6,init,section,click,,,,,,,,,,,,,wikitext-2017,desktop,0,3,User_talk:135.0.163.150


In [43]:
# Ready Events
ready_dt_events <- all_dt_events %>%
    filter(action == 'ready')

head(ready_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,ready,,,,380,,,,,,,,,,,wikitext-2017,desktop,34015,3,Brukerdiskusjon:Jon_Harald_Søby
2,ready,,,,266,,,,,,,,,,,visualeditor,desktop,5478960,4,ویکی‌پدیا:گزیدن_مقاله‌های_خوب/هیولا_(مجموعه_نمایش_خانگی)
3,ready,,,,174,,,,,,,,,,,visualeditor,desktop,0,1,Diskuse:Kropáč_(zbraň)
4,ready,,,,170,,,,,,,,,,,wikitext-2017,desktop,2023601,15,שיחת_קטגוריה:טיים_100_נקסט
5,ready,,,,86,,,,,,,,,,,wikitext-2017,desktop,1772268,1,Vita:A_bolgár_újjászületés_irodalma
6,ready,,,,25408,,,,,,,,,,,wikitext-2017,desktop,2544264,3,بحث_کاربر:Jeeputer


In [44]:
# Loaded Events
loaded_dt_events <- all_dt_events %>%
    filter(action == 'loaded')

head(loaded_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,loaded,,,,,297,,,,,,,,,,visualeditor,desktop,0,15,نقاش_التصنيف:قبائل_العرب_في_الجاهلية
2,loaded,,,,,206,,,,,,,,,,wikitext-2017,desktop,34015,3,Brukerdiskusjon:Jon_Harald_Søby
3,loaded,,,,,82,,,,,,,,,,wikitext-2017,desktop,13822,4,ויקיפדיה:מזנון
4,loaded,,,,,121,,,,,,,,,,wikitext-2017,desktop,10020594,5,Wikipedia_discusión:Votaciones/2021/Solicitud_de_desactivación_de_miniatura_estática
5,loaded,,,,,134,,,,,,,,,,wikitext-2017,desktop,3557124,1,ノート:BanG_Dream!
6,loaded,,,,,688,,,,,,,,,,wikitext-2017,desktop,548007,4,Wikipedia:Torget


In [36]:
# First Change Events
firstChange_dt_events <- all_dt_events %>%
    filter(action == 'firstChange')

head(firstChange_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,firstChange,,,,,,3979,,,,,,,,,wikitext-2017,desktop,4336380,1,Обговорення:Апостроф_TV
2,firstChange,,,,,,1626,,,,,,,,,wikitext-2017,desktop,534560,3,Overleg_gebruiker:Ecritures
3,firstChange,,,,,,2602,,,,,,,,,wikitext-2017,desktop,2023686,1,שיחה:הרפובליקה_של_אי_הוורדים
4,firstChange,,,,,,3509,,,,,,,,,wikitext-2017,desktop,69093,4,Wikipédia:Botgazdák_üzenőfala
5,firstChange,,,,,,2757,,,,,,,,,wikitext-2017,desktop,6389104,4,Wikipedia:可靠来源/布告板
6,firstChange,,,,,,1472,,,,,,,,,wikitext-2017,desktop,2422050,1,Discussion:Groupe_Pictet


In [37]:
# Save Intent Events
saveintent_dt_events <- all_dt_events %>%
    filter(action == 'saveIntent')

head(saveintent_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,saveIntent,,,,,,,196479,,,,,,,,wikitext-2017,desktop,13822,4,ויקיפדיה:מזנון
2,saveIntent,,,,,,,18022,,,,,,,,wikitext-2017,desktop,703892,4,Wikipedie:Nástěnka_správců
3,saveIntent,,,,,,,32788,,,,,,,,wikitext-2017,desktop,0,3,Benutzer_Diskussion:HoppyFloppy
4,saveIntent,,,,,,,46054,,,,,,,,wikitext-2017,desktop,2068919,3,Dyskusja_wikipedysty:Piotrus
5,saveIntent,,,,,,,200911,,,,,,,,wikitext-2017,desktop,69093,4,Wikipédia:Botgazdák_üzenőfala
6,saveIntent,,,,,,,341297,,,,,,,,wikitext-2017,desktop,255030,4,Wikipedia:優良條目評選


In [38]:
# Save Attempt Events
saveattempt_dt_events <- all_dt_events %>%
    filter(action == 'saveAttempt')

head(saveattempt_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,saveAttempt,,,,,,,,0,,,,,,,wikitext-2017,desktop,68745119,3,User_talk:Ankitnaithani1999
2,saveAttempt,,,,,,,,0,,,,,,,wikitext-2017,desktop,5150247,4,Wikipedia:Poczekalnia/artykuły/2021:09:16:Pajdokracja
3,saveAttempt,,,,,,,,8,,,,,,,visualeditor,desktop,13822,4,ויקיפדיה:מזנון
4,saveAttempt,,,,,,,,1,,,,,,,wikitext-2017,desktop,4336380,1,Обговорення:Апостроф_TV
5,saveAttempt,,,,,,,,0,,,,,,,wikitext-2017,desktop,1518799,3,שיחת_משתמש:62.219.74.216
6,saveAttempt,,,,,,,,3,,,,,,,wikitext-2017,desktop,4197471,11,Discussioni_template:Imbarcazione_storica


In [39]:
# Save Success Events
savesuccess_dt_events <- all_dt_events %>%
    filter(action == 'saveSuccess')

head(savesuccess_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,saveSuccess,,,,,,,,,4801,,,,,,wikitext-2017,desktop,19642030,4,Wikipedia:Biểu_quyết_xoá_bài/Danh_sách_bản_quyền_thể_thao_tại_Việt_Nam
2,saveSuccess,,,,,,,,,4889,,,,,,wikitext-2017,desktop,1867796,3,שיחת_משתמש:Neriah
3,saveSuccess,,,,,,,,,5561,,,,,,visualeditor,desktop,398195,12,Ajuda:Tire_suas_dúvidas
4,saveSuccess,,,,,,,,,15573,,,,,,visualeditor,desktop,24767,4,Вікіпедія:Кнайпа_(різне)
5,saveSuccess,,,,,,,,,2638,,,,,,wikitext-2017,desktop,4336380,1,Обговорення:Апостроф_TV
6,saveSuccess,,,,,,,,,599,,,,,,wikitext-2017,desktop,11924196,3,Benutzer_Diskussion:HoppyFloppy


In [40]:
# Save Failure Events
savefailure_dt_events <- all_dt_events %>%
    filter(action == 'saveFailure')

head(savefailure_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,visualeditor,desktop,9122621,3,Discussioni_utente:Loffry_1
2,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,wikitext-2017,desktop,5810790,3,بحث_کاربر:Prvizprvizi
3,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,visualeditor,desktop,3878720,3,Discussioni_utente:Mmagalini
4,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,wikitext-2017,desktop,3044602,3,사용자토론:211.217.64.35
5,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,visualeditor,desktop,9122621,3,Discussioni_utente:Loffry_1
6,saveFailure,,,,,,,,,,responseUnknown,http-0,,,,visualeditor,desktop,9124823,3,Discussioni_utente:80.182.52.231


In [41]:
# Abort Events
abort_dt_events <- all_dt_events %>%
    filter(action == 'abort')

head(abort_dt_events)

Unnamed: 0_level_0,action,init_type,init_mechanism,init_timing,ready_timing,loaded_timing,first_change_timing,save_intent_timing,save_attempt_timing,save_success_timing,save_failure_type,save_failure_message,abort_type,abort_mechanism,abort_timing,editor_interface,platform,page_id,page_ns,page_title
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<int>,<chr>
1,abort,,,,,,,,,,,,nochange,cancel,2683,wikitext-2017,desktop,34015,3,Brukerdiskusjon:Jon_Harald_Søby
2,abort,,,,,,,,,,,,nochange,cancel,33352,wikitext-2017,desktop,69093,4,Wikipédia:Botgazdák_üzenőfala
3,abort,,,,,,,,,,,,nochange,cancel,6432,visualeditor,desktop,11105803,3,Benutzer_Diskussion:Julius2803
4,abort,,,,,,,,,,,,nochange,cancel,4760,wikitext-2017,desktop,7549798,4,Wikipedia:修订版本删除请求/存档/2021年7月
5,abort,,,,,,,,,,,,preinit,,44,visualeditor,desktop,403844,4,ويكيبيديا:طلبات_صلاحيات
6,abort,,,,,,,,,,,,nochange,navigate,14997,wikitext-2017,desktop,635987,3,Keskustelu_käyttäjästä:137.163.31.188


Missing Fields:
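- `init_timing` is NULL for init events. Check whether this is also true for non-DT events.
- `ready_timing` is populated for ready events.
- `loaded_timing` is populated for loaded events.
- `first_change_timing` is populated for firstChange events.
- `save_intent_timing` is populated for saveIntent events.
- `save_attempt_timing` is populated for saveAttempt events.
- `save_success_timing` is populated for saveSuccess events.
- `save_failure_type` and `save_failure_message` are recorded for saveFailure events.
- Abort fields are populated for abort events: `abort_type`, `abort_mechanism`, and `abort_timing` (`abort_mechanism` is empty for the `preinit` abort shown above).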
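
The manual review above could be automated by checking, for each action, whether its matching timing field is populated. The following is a minimal sketch, assuming one example row per action with the timing value copied from the first row of each output above. The action-to-field mapping `expected_field` and the `sample_events` data frame are constructed for illustration; in the real data the timing columns come back as character and would need `as.numeric()` first.

```r
# Timing field expected to be populated for each action (per the checks above).
expected_field <- c(
  init = "init_timing", ready = "ready_timing", loaded = "loaded_timing",
  firstChange = "first_change_timing", saveIntent = "save_intent_timing",
  saveAttempt = "save_attempt_timing", saveSuccess = "save_success_timing",
  abort = "abort_timing"
)

# One example row per action; every timing column starts as NA.
sample_events <- data.frame(action = names(expected_field), stringsAsFactors = FALSE)
for (field in unname(expected_field)) sample_events[[field]] <- NA_real_

# Fill in the timing seen in the first displayed row of each output (init has none).
observed <- c(ready = 380, loaded = 297, firstChange = 3979, saveIntent = 196479,
              saveAttempt = 0, saveSuccess = 4801, abort = 2683)
for (action in names(observed)) {
  rows <- sample_events$action == action
  sample_events[rows, expected_field[[action]]] <- observed[[action]]
}

# Share of rows per action whose expected timing field is non-missing.
share_filled <- mapply(function(action, field) {
  rows <- sample_events$action == action
  mean(!is.na(sample_events[rows, field]))
}, names(expected_field), expected_field)

print(share_filled)
```

Run against `all_dt_events`, any action with a share below 1 would point to a field worth investigating.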



## Check if init_timing is missing for non-DT events as well

In [67]:
# Collect all init events (all integrations) logged since 1 September 2021
query <-
"
SELECT 
  event.integration,
  event.init_type,
  event.init_mechanism,
  event.init_timing,
  event.editor_interface,
  event.platform
FROM event.editattemptstep
WHERE
  year = 2021
  AND dt >= '2021-09-01'
  AND event.action = 'init'
"

In [68]:
all_init_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [69]:
all_init_events

integration,init_type,init_mechanism,init_timing,editor_interface,platform
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
page,section,click,,wikitext,phone
page,section,click,,visualeditor,phone
page,section,url,,wikitext,desktop
page,page,click,,wikitext,desktop
page,section,click,,visualeditor,phone
page,section,click,,wikitext,phone
page,section,click,,wikitext,phone
page,page,click,,wikitext,desktop
page,page,click,,wikitext,desktop
page,section,click,,visualeditor,phone


`init_timing` is currently NULL for all init events, not only for DT events.