Loading FULL HTML files

Introduction

With the procedure table load(), called with FULL HTML as the third parameter, B4P loads complete HTML files in a similar way to XML files. The entire HTML file is put into a structured table with one row per tag. The table contains the following header names:

HTML Level        Nesting level. Begins with 1.
HTML Tag          Applied tag. E.g. <Country> puts "Country" into a new row in the column "HTML Tag".
HTML Usage        <Country> specifies "Start", </Country> specifies "End", and <Country/> specifies "Empty".
HTML Contents     Payload contents following the tag. E.g. <Country>UK</Country> specifies "UK". Note: All line breaks are included in the contents.
HTML Attributes   Lists all attribute names (but not the values) referenced in the HTML tag.

Additional columns:
Attribute names   The table gets additional columns with header names corresponding to the identified attribute names. Whenever attributes are specified in the tags, the attribute values are listed below these headers.
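
The column layout described above can be modeled outside B4P. The following is a minimal Python sketch, not B4P code and not B4P's actual implementation, that turns HTML text into rows with the same header names. The names RowBuilder, html_to_rows and VOID_TAGS are hypothetical. The sketch assumes that tags which never take a closing tag (such as meta) are reported as "Empty" one level below their parent, which is what the example output on this page suggests.

```python
from html.parser import HTMLParser

# Elements without a closing tag; assumption: these are reported as "Empty".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class RowBuilder(HTMLParser):
    """Collects one row (a dict keyed by header name) per tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0   # nesting level of the innermost open tag
        self.rows = []

    def _add(self, level, tag, usage, attrs=()):
        row = {"HTML Level": level, "HTML Tag": tag, "HTML Usage": usage,
               "HTML Contents": "",
               # attribute names only, comma separated
               "HTML Attributes": ",".join(name for name, _ in attrs)}
        for name, value in attrs:
            # one additional column per attribute name, holding its value
            row[name] = value if value is not None else ""
        self.rows.append(row)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self._add(self.depth + 1, tag, "Empty", attrs)
        else:
            self.depth += 1
            self._add(self.depth, tag, "Start", attrs)

    def handle_startendtag(self, tag, attrs):
        # explicit self-closing form, e.g. <Country/>
        self._add(self.depth + 1, tag, "Empty", attrs)

    def handle_endtag(self, tag):
        self._add(self.depth, tag, "End")
        self.depth -= 1

    def handle_data(self, data):
        # text following a tag becomes the contents of that tag's row
        if self.rows:
            self.rows[-1]["HTML Contents"] += data


def html_to_rows(text):
    builder = RowBuilder()
    builder.feed(text)
    builder.close()
    return builder.rows


if __name__ == "__main__":
    sample = "<p class=MsoNormal><span lang=EN-US>The quick <b>brown fox</b> jumps</span></p>"
    for r in html_to_rows(sample):
        print(r["HTML Level"], r["HTML Tag"], r["HTML Usage"], repr(r["HTML Contents"]))
```

Text is attached to the row of the tag it follows, so text after a closing tag lands in the "End" row. This matches the description of "HTML Contents" as payload following the tag.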

  table load( t, "Examples\Example.html", TEXT );

  echo("This is the HTML file:");
  table list( t );

  table load( t, "Examples\Example.html", FULL HTML );

  echo; echo;
  echo("This are the loaded contents from the HTML file:");
  table list( t );

This is the HTML file:
    0 : <html>                                                                       
    1 :                                                                              
    2 : <head>                                                                       
    3 : <meta http-equiv=Content-Type content="text/html; charset=windows-1252">     
    4 : <meta name=Generator content="Microsoft Word 15 (filtered)">                 
    5 :                                                                              
    6 : </head>                                                                      
    7 :                                                                              
    8 : <body lang=DE-CH style='word-wrap:break-word'>                               
    9 :                                                                              
   10 : <div class=WordSection1>                                                     
   11 :                                                                              
   12 : <p class=MsoNormal><span lang=EN-US>The quick <b>brown fox</b> jumps over the
   13 : lazy dog</span></p>                                                          
   14 :                                                                              
   15 : </div>                                                                       
   16 :                                                                              
   17 : </body>                                                                      
   18 :                                                                              
   19 : </html>                                                                      
   20 :                                                                              



This are the loaded contents from the HTML file:
    0 : HTML Level | HTML Tag | HTML Usage | HTML Contents   | HTML Attributes    | http-equiv   | content                         | name      | lang  | style                | class       
    1 : 1          | html     | Start      |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    2 : 2          | head     | Start      |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    3 : 3          | meta     | Empty      |                 | http-equiv,content | Content-Type | text/html; charset=windows-1252 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    4 : 3          | meta     | Empty      |                 | name,content       |              | Microsoft Word 15 (filtered)    | Generator |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    5 : 2          | head     | End        |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    6 : 2          | body     | Start      |                 | lang,style         |              |                                 |           | DE-CH | word-wrap:break-word |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    7 : 3          | div      | Start      |                 | class              |              |                                 |           |       |                      | WordSection1
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
    8 : 4          | p        | Start      |                 | class              |              |                                 |           |       |                      | MsoNormal   
    9 : 5          | span     | Start      | The quick       | lang               |              |                                 |           | EN-US |                      |             
   10 : 6          | b        | Start      | brown fox       |                    |              |                                 |           |       |                      |             
   11 : 6          | b        | End        |  jumps over the |                    |              |                                 |           |       |                      |             
      :            |          |            | lazy dog        |                    |              |                                 |           |       |                      |             
   12 : 5          | span     | End        |                 |                    |              |                                 |           |       |                      |             
   13 : 4          | p        | End        |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
   14 : 3          | div      | End        |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
   15 : 2          | body     | End        |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
      :            |          |            |                 |                    |              |                                 |           |       |                      |             
   16 : 1          | html     | End        |                 |                    |              |                                 |           |       |                      |             
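
Once the file is in table form, typical follow-up tasks are collecting attribute values and recovering the readable text. The following is a small Python sketch of both, not B4P code, working on a hand-copied subset of the rows listed above. The helper names row, attribute_values and plain_text are hypothetical, and the whitespace-only contents are approximated from the listing.

```python
def row(level, tag, usage, contents="", attrs=None):
    """Build one table row as a dict keyed by the header names."""
    attrs = attrs or {}
    r = {"HTML Level": level, "HTML Tag": tag, "HTML Usage": usage,
         "HTML Contents": contents, "HTML Attributes": ",".join(attrs)}
    r.update(attrs)  # one extra column per attribute name
    return r


# Subset of the loaded table shown above (rows 3, 4 and 8 to 13).
ROWS = [
    row(3, "meta", "Empty", "\n",
        {"http-equiv": "Content-Type", "content": "text/html; charset=windows-1252"}),
    row(3, "meta", "Empty", "\n\n",
        {"name": "Generator", "content": "Microsoft Word 15 (filtered)"}),
    row(4, "p", "Start", "", {"class": "MsoNormal"}),
    row(5, "span", "Start", "The quick ", {"lang": "EN-US"}),
    row(6, "b", "Start", "brown fox"),
    row(6, "b", "End", " jumps over the\nlazy dog"),
    row(5, "span", "End"),
    row(4, "p", "End", "\n\n"),
]


def attribute_values(rows, tag, name):
    """Values of attribute `name` on every row with the given tag."""
    return [r[name] for r in rows if r["HTML Tag"] == tag and r.get(name)]


def plain_text(rows):
    """Join all contents in row order and collapse line breaks and spaces."""
    return " ".join("".join(r["HTML Contents"] for r in rows).split())


if __name__ == "__main__":
    print(attribute_values(ROWS, "meta", "content"))
    print(plain_text(ROWS))
```

Because line breaks are kept in "HTML Contents", the text is normalized only after all rows have been joined. Normalizing each cell on its own would lose the space between "The quick" and "brown fox".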

Try it yourself: Open LIB_Features_Loading_FULL_HTML_files.b4p in B4P_Examples.zip. Decompress before use.

See also

table load
Loading XML files
Loading JSON files