Segmented Telemetry Data Filter
Administrator's manual

Eduard Tibet

28.03.2022

Table of Contents

1. Introduction
   1.1. Scope of this document
   1.2. Document structure
2. Description of the STDF
   2.1. Brief description of the STDF
   2.2. Overall design of STDF
3. Installation of the software
   3.1. System requirements
   3.2. User qualification
   3.3. Installation process of components
        3.3.1. Getting packages of components
        3.3.2. Installation of a coordinator component
        3.3.3. Installation of a loadbalancer component
        3.3.4. Installation of a filtering component
4. Authoring filtering rules
5. Using and verifying filtered data
6. Troubleshooting
   6.1. Problem: no connection from a filter node to a coordinator
   6.2. Problem: filtering node doesn't receive filtering rules
   6.3. Problem: filtering node doesn't receive data
   6.4. Problem: loadbalancer doesn't receive any data
   6.5. Problem: Filter produces incorrect results
A. Technology stack behind this sample document
B. License

1. Introduction

1.1. Scope of this document

This is the complete administrator's manual for the Segmented Telemetry Data Filter (STDF) software. It briefly describes what STDF is intended for, its overall design, and what each component is intended to do. The manual also provides full information about the installation process and usage of STDF. The theory and principles of data filtering, as well as the syntax of the Erlang language (used for data filtering), are completely out of the scope of this manual.

1.2. Document structure

This document consists of the following parts:

- Introduction - the current section.
- Description of the STDF - a description of the software's overall design, features, and functionality.
- Installation of the software - information about system requirements and the installation of the software.
- Authoring filtering rules - describes how to create and master the filtering rules that have to be deployed to one of the software's components.
- Using and verifying filtered data - a section about customizing and fine-tuning the final data.
- Troubleshooting - a list of possible issues and ways to resolve them.

2. Description of the STDF

2.1. Brief description of the STDF

STDF is data handling software designed to help capture high-speed telemetry data. The purpose of STDF is to automatically and linearly scale the processing capacity for such data. STDF segments the data into smaller chunks and sends them through a load balancer to several servers that filter the received data. That way it is possible to:

- avoid using a single high-powered processing unit to work with the data;
- reduce the required power of any single unit used for processing;
- deploy the system with great flexibility and scalability, based on various initial requirements and/or conditions.
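The segmentation idea described above can be pictured as splitting an incoming byte stream into fixed-size chunks before they are handed to the load balancer. The following minimal Python sketch is an illustration only; the function name and chunk size are assumptions made for this example and are not part of STDF:

    def segment(stream, chunk_size=4096):
        # Split an incoming telemetry byte stream into fixed-size chunks.
        # Illustrative assumption: STDF's real segmentation logic and
        # chunk size are internal and not documented in this manual.
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            yield chunk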
2.2. Overall design of STDF

The system consists of several parts:

- coordinator component (node) - used for smart management of the whole system;
- loadbalancer component (node) - used for receiving raw data from external sources (e.g. sensors) and transferring it further, based on the coordinator's directives;
- filter component(s)/node(s) - used to process the data received from the loadbalancer. Processing is based on the current workload: if it exceeds the maximum defined by the coordinator, data chunks automatically migrate to other filter nodes whose free resources are sufficient to handle the data. The number of filter components within an installation varies and is based on the current performance needs.

At the heart of STDF is a proprietary protocol developed by the Teliota company. This protocol is used between the components to coordinate data manipulation, the calculations on the individual filters running on each server, and data migration between filters.

The typical workflow includes the following steps:

1. The loadbalancer component receives raw data from external sources (e.g. sensors) and transmits it further to the filters, based on the coordinator's current workload rules and internal logic.
2. A filter component receives an independent dataset from the loadbalancer and asks the cluster's coordinator to supply the filtering rules.
3. The coordinator provides the rules to the filter, and the rules are then applied on the fly to the incoming data received from the loadbalancer.

Each filtering component can talk to the coordinator component about the data it is processing or wishes to process. The coordinator component instructs the loadbalancer component which data the loadbalancer should provide to which filter node.

Figure 1. Overall design of STDF

If a filter component gets overloaded by the data, its tasks can be offloaded to another filter node. Due to the nature of the workflow, the algorithm assumes that:

- a sufficient number of redundant servers (filter nodes) exists in the pool during an overload situation;
- the offloaded data is similar to the original data and can be filtered with the same rules.

An offloaded filter node is, therefore, not "independent". It has to process the same data and instructions as its peer until the overload situation is resolved.

New processing (filter) nodes can be added to the processing cluster on the fly by:

1. adding new server hardware;
2. installing the filter component software onto it;
3. configuring the coordinator server address.

The filter node will register itself with the coordinator, and the coordinator will instruct the loadbalancer to forward traffic to the new node.

Telemetry data and filter operations are defined with a definition file, which in turn is written in a proprietary filter rule language. The language defines in detail:

- what the incoming data stands for;
- how the data may be aggregated and filtered out when outliers or unwanted values are found.

The coordinator reads the filter language files and runs them on its own logic processing engine. This engine is connected to all the filtering nodes, which receive processing instructions in the form of a proprietary, compressed command protocol. The protocol is bidirectional:

- the filter nodes and the loadbalancer inform the coordinator about the data they receive and about their status;
- the coordinator instructs:
  - the loadbalancer - where to deploy the initial raw data;
  - the filters - what the data is and how that data should be manipulated.
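Since the command protocol is proprietary, its wire format cannot be shown here; the following Python sketch only illustrates the shape of the bidirectional exchange described above. Every message field and function name in it is an assumption made for illustration (the real protocol is binary and compressed, not JSON):

    import json

    def make_status_report(node_id, load, datasets):
        # Direction 1: a filter node (or the loadbalancer) reports to the
        # coordinator what data it handles and its current workload.
        return json.dumps({
            "msg_type": "status_report",   # hypothetical field names
            "node_id": node_id,
            "load": load,                  # e.g. a 0.0-1.0 utilization figure
            "datasets": datasets,          # ids of datasets being handled
        })

    def make_routing_directive(dataset_id, target_node):
        # Direction 2: the coordinator instructs the loadbalancer where
        # to deploy an incoming dataset.
        return json.dumps({
            "msg_type": "routing_directive",
            "dataset_id": dataset_id,
            "target_node": target_node,
        })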
3. Installation of the software

3.1. System requirements

To successfully install and run STDF, your base hardware/software installation has to comply with the following requirements:

- two (2) dedicated hardware servers for the coordinator and loadbalancer components;
- no other application software (e.g. MTA, DB, etc.), except the operating system and system utilities, should be installed on the above servers;
- the required number of servers to be used as hosts for the filtering components (nodes);
- network connectivity with all sensors that gather information for your application - your firewall rules should allow the sensors to access the STDF cluster (the loadbalancer component);
- network connectivity between all components of the STDF installation and the data receivers beyond the STDF deployment (DB or third-party application servers);
- any recent Linux distribution with a kernel 2.6.32 or later;
- standard (base) Linux utilities, including:
  - tar - a utility to work with .tar files;
  - wget - a utility to get packages from the distribution server;
  - any console text editor to edit configuration files - e.g. vim, nano, etc.

3.2. User qualification

To install and maintain STDF, the system administrator has to:

- have skills equal to those sufficient to successfully pass the LPIC-2 exam;
- have some knowledge of Erlang language syntax to write filtering rules;
- have read thoroughly the "STDF filtering rules language reference" manual (supplied by Teliota separately).

3.3. Installation process of components

3.3.1. Getting packages of components

All packages are to be downloaded from the Teliota distribution web server: https://download.teliota.com .

3.3.2. Installation of a coordinator component

To install a coordinator component:

1. Go to the top-level installation directory.
2. Make a directory for the coordinator's files:
   $ mkdir stdf_coordinator
3. Change to the newly created directory:
   $ cd stdf_coordinator
4. Download the package with the coordinator component:
   $ wget https://download.teliota.com/bin/stdf_coordinator.tar.bz2
5. Untar the coordinator component files:
   $ tar -xjf stdf_coordinator.tar.bz2
6. Open the configuration file config.ini in any text editor and set the IP address and port that the coordinator component should listen on:
   COORDINATOR_SERVER_LISTEN_IP=192.168.2.53
   COORDINATOR_SERVER_LISTEN_PORT=8860
7. Change to the bin/ directory:
   $ cd bin/
8. Check that the file stdf_coordinator.sh has its execution bit turned on (e.g. with ls -l); if it doesn't, set it with chmod +x stdf_coordinator.sh.
9. Run the coordinator:
   $ ./stdf_coordinator.sh

The coordinator needs to be fed with filtering rules. The coordinator includes a separate language parsing and debugging tool that validates a filter rule.

Note
It is assumed that you already have filtering rules written. If you don't have any rules written yet, first check the section Authoring filtering rules.

To deploy a filtering rule:

1. Check the filtering rule:
   $ ./stdf_parser.sh -i [rulefile1]
2. If there are any output messages, read them carefully. These messages are also saved to a log file for future analysis.
3. Copy the rule file to the filter_rules directory within the coordinator installation:
   $ cp [rulefile1] ../filter_rules
4. Open the configuration file config.ini in any text editor and add the recently copied file to the coordinator's configuration:
   COORDINATOR_RULES_FILES=rulefile1,rulefile2
5. Restart the coordinator component:
   $ ./stdf_coordinator.sh restart
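After the two procedures above, the coordinator-related part of config.ini combines the listen address from step 6 of the installation with the rule files from step 4 of the deployment. Only keys shown in this manual appear here, with the same example values:

   COORDINATOR_SERVER_LISTEN_IP=192.168.2.53
   COORDINATOR_SERVER_LISTEN_PORT=8860
   COORDINATOR_RULES_FILES=rulefile1,rulefile2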
3.3.3. Installation of a loadbalancer component

To install a loadbalancer component:

1. Change the current directory to the top-level installation directory.
2. Make a directory for the loadbalancer component files:
   $ mkdir stdf_loadbalancer
3. Change to the newly created directory:
   $ cd stdf_loadbalancer
4. Download the package with the loadbalancer component:
   $ wget https://download.teliota.com/bin/stdf_loadbalancer.tar.bz2
5. Untar the loadbalancer component files:
   $ tar -xjf stdf_loadbalancer.tar.bz2
6. Open the configuration file config.ini in any text editor and point the loadbalancer to the coordinator's IP address and port number:
   COORDINATOR_SERVER_IP=192.168.2.53
   COORDINATOR_SERVER_PORT=8860
7. Change to the bin/ directory:
   $ cd ./bin
8. Check that the file stdf_loadbalancer.sh has its execution bit turned on (see step 8 in Section 3.3.2).
9. Run the loadbalancer component:
   $ ./stdf_loadbalancer.sh

3.3.4. Installation of a filtering component

To install a filtering component:

1. Change the current directory to the top-level installation directory.
2. Make a directory for the filtering component files:
   $ mkdir stdf_node
3. Change to the newly created directory:
   $ cd stdf_node
4. Download the package with the filtering component:
   $ wget https://download.teliota.com/bin/stdf_node.tar.bz2
5. Untar the filtering component files:
   $ tar -xjf stdf_node.tar.bz2
6. Open the configuration file config.ini in any text editor and point the filtering component to the coordinator's IP address and port number:
   COORDINATOR_SERVER_IP=192.168.2.53
   COORDINATOR_SERVER_PORT=8860
7. Change to the bin/ directory:
   $ cd ./bin
8. Check that the file stdf_node.sh has its execution bit turned on (see step 8 in Section 3.3.2).
9. Run the filtering component:
   $ ./stdf_node.sh
10. Repeat the above steps for every filter component that is to be installed.
11. Start feeding data into the data interface of the loadbalancer component.

4. Authoring filtering rules

Note
This section only briefly describes the structure of filtering rules. For detailed information, take a look at the "STDF filtering rules language reference" manual (supplied separately).

Filtering rules are defined using a filtering language that is based on Erlang language syntax.

Each filtering rule includes three elements (so-called "definitions"):

- data definition - describes the nature of the data to be filtered, including the pattern by which the incoming data can be recognized (e.g. port, input URL, data header); the data definition assigns an identifier to the dataset so that the correlation and filter rules can refer to it;
- correlation definition - describes how the data depends on itself or on some other identified dataset;
- filter definition - describes what actions are to be taken on the data when it arrives.

How the three definitions fit together is sketched below.
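The following Python sketch models the three definitions conceptually. It is not the STDF rule language (which is proprietary and Erlang-based); every name and value in it is an illustrative assumption:

    # Conceptual model only - NOT the STDF rule language syntax.
    rule = {
        # data definition: how to recognize the dataset, plus the
        # identifier that the other two definitions refer to
        "data": {
            "id": "engine_temp",                       # hypothetical id
            "match": {"port": 5001, "header": "TEMP"}, # recognition pattern
        },
        # correlation definition: how this dataset depends on another one
        "correlation": {
            "depends_on": "engine_rpm",                # another dataset id
        },
        # filter definition: what to do with the data when it arrives
        "filter": {
            "drop_if": lambda v: v < -50 or v > 300,   # discard outliers
            "aggregate": "mean",                       # aggregate kept values
        },
    }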
5. Using and verifying filtered data

The filtering cluster automatically appoints one of its nodes as a forwarder, based on the load of the servers. The forwarder collects the data from each filtering node, combines it into one stream, and sends it to whatever server is designated as the final receiver (destination).

Important
The filtering components (nodes) don't store any data - they only perform filtering. You have to define and configure a storage server beyond the STDF deployment that will perform any and all database processing. The connection to the designated DB server is configured in the coordinator component's configuration file config.ini.

The forwarder can optionally inject additional data headers and trailers into the initial data block for easier recognition of its nature - the source transmitter/generator. The trailer may contain a CRC for checking data integrity. The algorithm for the CRC is shown below:

    def crc16(buff, crc=0, poly=0xa001):
        # Process the buffer byte by byte; for every byte, feed its bits
        # into the CRC register LSB-first.
        l = len(buff)
        i = 0
        while i < l:
            ch = buff[i]
            uc = 0
            while uc < 8:
                if (crc & 1) ^ (ch & 1):
                    crc = (crc >> 1) ^ poly
                else:
                    crc >>= 1
                ch >>= 1
                uc += 1
            i += 1
        return crc

The two trailer bytes are then taken from the returned 16-bit value:

    crc = crc16(buff)
    crc_byte_high = (crc >> 8)
    crc_byte_low = (crc & 0xFF)
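As a quick sanity check of the routine above: with the default initial value 0 and polynomial 0xa001, it matches the widely used CRC-16/ARC variant, whose check value for the standard test string is 0xbb3d. The payload below is an arbitrary example, not real STDF data:

    data = b"123456789"                      # standard CRC test vector
    crc = crc16(data)
    print(hex(crc))                          # prints 0xbb3d
    trailer = bytes([crc >> 8, crc & 0xFF])  # high byte, low byte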
6. Troubleshooting

6.1. Problem: no connection from a filter node to a coordinator

Possible reason: Any of the coordinator's IP settings on a filter node are incorrect or were not set.
Solution: Check that the IP addresses and port numbers configured on the filters are correct.

Possible reason: Firewall rules don't allow filter packets to reach the coordinator.
Solution: Check that the coordinator's firewall settings (open ports and IP rules) are correct.

Possible reason: The coordinator node is not running.
Solution: Check whether the coordinator is really running.

6.2. Problem: filtering node doesn't receive filtering rules

Possible reason: Any of the coordinator's IP settings on a filter node are incorrect or were not set.
Solution: Check that the IP addresses and port numbers are correct (see the first solution of the previous problem).

Possible reason: Errors in the filtering language.
Solution: Check the coordinator's log file for errors.

Possible reason: Issues with network connectivity or the software used.
Solution: Check the coordinator's log file for errors; check the node's firewall settings.

6.3. Problem: filtering node doesn't receive data

Possible reason: The loadbalancer is not running.
Solution: Check for errors in the loadbalancer's log files.

Possible reason: Ports are closed or filtered by a firewall.
Solution: Check the node's firewall settings.

Possible reason: No data is actually being received.
Solution: Check the loadbalancer's log file for transmitted data.

6.4. Problem: loadbalancer doesn't receive any data

Possible reason: The loadbalancer is not running.
Solution: Check whether the loadbalancer is running and check for errors in the loadbalancer's log files.

Possible reason: Ports are closed or filtered by a firewall.
Solution: Check the loadbalancer's firewall settings.

6.5. Problem: Filter produces incorrect results

Possible reason: Incorrect initial filter setup.
Solution: Run the nodes with a higher level of verbosity: start them with ./stdf_node.sh -vvv, then check the log files for possible issues.

Possible reason: Incorrect filter rules.
Solution: Run the filter language parser and validate the rule's syntax: ./stdf_parser.sh --validate [rulefile1]

A. Technology stack behind this sample document

The source files of this document:

- were completely written in the DocBook/XML 5.1 (https://docbook.org/xml/5.1/) format, which is an OASIS Standard (https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=docbook);
- were WYSIWYM-authored using XMLmind XML Editor (http://www.xmlmind.com/xmleditor/) version 7.3 by XMLmind Software (http://www.xmlmind.com), installed on the author's desktop running Debian GNU/Linux 10.11 "buster" (https://www.debian.org/); the author also used Dia Diagram Editor (http://dia-installer.de/) for the diagrams;
- are freely available on GitHub as the docbook-samples project (https://github.com/eduardtibet/docbook-samples);
- are distributed under a Creative Commons license - for details, see Appendix B, License.

To produce the .fo file of this document, the following software was used:

- a local copy of DocBook XSL Stylesheets v. 1.79.1 (http://docbook.sourceforge.net/release/xsl/);
- the author's customization layer of the above stylesheets, now the docbook-pretty-playout project (https://github.com/eduardtibet/docbook-pretty-playout), freely available on GitHub;
- xsltproc as the engine to produce the .fo file from the DocBook source .xml file (xsltproc compiled against libxml 20904, libxslt 10129 and libexslt 817).

To get the resulting .pdf file from the .fo file, the author used the Apache FOP 2.3 (http://xmlgraphics.apache.org/fop/) engine with the foponts project (https://github.com/eduardtibet/foponts), created and maintained by the author of this document.

B. License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/).