Channel: Rainmeter Forums

Lua Scripting • Convert HTML/XML to Lua Table (and vice versa)

I wanted to keep these functions short, but since I'm still a Lua newbie they grew too big to fit in the Tips and Tricks thread.
I decided to make them into a module since it might be expanded in the future. Of course, you can simply copy any function into your own script and rename it if you don't want the whole module.

As I mentioned, I'm still a newbie, so improvements are very welcome (and asked for).

HTML2Table.lua converts HTML and XML into Lua tables, and vice versa. It also supports data extraction and works with htmlEntities.lua for encoding/decoding.


Testing Skin:
(includes a modified version of htmlEntities.lua that adds two options: "encode symbols" and "HTML-sensitive symbols only")

html2table_1.0.rmskin

Lua only:

HTML2Table.lua

Features:
  • HTML to Lua Table: Parse HTML code into a structured Lua table.
  • Lua Table to HTML: Convert Lua tables back into HTML or XML.
  • Works with XML: Supports both HTML and XML formats.
  • HTML Entities Support: Compatible with htmlEntities.lua module for encoding and decoding.
  • Data Extraction: Extract string data from Lua tables.

Functions:

1. HTML2Table.toTable()
Converts HTML/XML code to a structured Lua table.

Parameters:
  • html (string, required): The HTML/XML code to convert.
  • decode (boolean, optional): Whether to decode HTML character references in the text content (requires htmlEntities.lua).
Example output:

Code:

{
  [1] = {
    attributes = { class = "tb1" },
    content = {
      [1] = {
        content = {
          [1] = { attributes = { colspan = "4" }, content = { [1] = "Date and Time (Universal Time)" }, tag = "th" },
        },
        tag = "tr",
      },
      [2] = {
        content = {
          [1] = { content = { [1] = "New Moon" }, tag = "th" },
          [2] = { content = { [1] = "First Quarter" }, tag = "th" },
          [3] = { content = { [1] = "Full Moon" }, tag = "th" },
          [4] = { content = { [1] = "Last Quarter" }, tag = "th" },
        },
        tag = "tr",
      },
      [3] = {
        content = {
          [1] = { content = { [1] = "--" }, tag = "td" },
          [2] = { content = { [1] = "2025 Jan 6 23:56" }, tag = "td" },
          [3] = { content = { [1] = "2025 Jan 13 22:27" }, tag = "td" },
          [4] = { content = { [1] = "2025 Jan 21 20:31" }, tag = "td" },
        },
        tag = "tr",
      },
      [4] = {
        content = {
          [1] = { content = { [1] = "2025 Jan 29 12:36" }, tag = "td" },
          [2] = { content = { [1] = "2025 Feb 5 08:02" }, tag = "td" },
          [3] = { content = { [1] = "2025 Feb 12 13:53" }, tag = "td" },
          [4] = { content = { [1] = "2025 Feb 20 17:32" }, tag = "td" },
        },
        tag = "tr",
      },
      [5] = {
        content = {
          [1] = { content = { [1] = "2025 Feb 28 00:45" }, tag = "td" },
          [2] = { content = { [1] = "2025 Mar 6 16:31" }, tag = "td" },
          [3] = { content = { [1] = "2025 Mar 14 06:55" }, tag = "td" },
          [4] = { content = { [1] = "2025 Mar 22 11:29" }, tag = "td" },
        },
        tag = "tr",
      },
      [6] = {
        content = {
          [1] = { content = { [1] = "2025 Mar 29 10:58" }, tag = "td" },
          [2] = { content = { [1] = "2025 Apr 5 02:15" }, tag = "td" },
          [3] = { content = { [1] = "2025 Apr 13 00:22" }, tag = "td" },
          [4] = { content = { [1] = "2025 Apr 21 01:35" }, tag = "td" },
        },
        tag = "tr",
      },
      [7] = {
        content = {
          [1] = { content = { [1] = "2025 Apr 27 19:31" }, tag = "td" },
          [2] = { content = { [1] = "2025 May 4 13:52" }, tag = "td" },
          [3] = { content = { [1] = "2025 May 12 16:56" }, tag = "td" },
          [4] = { content = { [1] = "2025 May 20 11:59" }, tag = "td" },
        },
        tag = "tr",
      },
      [8] = {
        content = {
          [1] = { content = { [1] = "2025 May 27 03:02" }, tag = "td" },
          [2] = { content = { [1] = "2025 Jun 3 03:41" }, tag = "td" },
          [3] = { content = { [1] = "2025 Jun 11 07:44" }, tag = "td" },
          [4] = { content = { [1] = "2025 Jun 18 19:19" }, tag = "td" },
        },
        tag = "tr",
      },
      [9] = {
        content = {
          [1] = { content = { [1] = "2025 Jun 25 10:31" }, tag = "td" },
          [2] = { content = { [1] = "2025 Jul 2 19:30" }, tag = "td" },
          [3] = { content = { [1] = "2025 Jul 10 20:37" }, tag = "td" },
          [4] = { content = { [1] = "2025 Jul 18 00:38" }, tag = "td" },
        },
        tag = "tr",
      },
      [10] = {
        content = {
          [1] = { content = { [1] = "2025 Jul 24 19:11" }, tag = "td" },
          [2] = { content = { [1] = "2025 Aug 1 12:41" }, tag = "td" },
          [3] = { content = { [1] = "2025 Aug 9 07:55" }, tag = "td" },
          [4] = { content = { [1] = "2025 Aug 16 05:12" }, tag = "td" },
        },
        tag = "tr",
      },
      [11] = {
        content = {
          [1] = { content = { [1] = "2025 Aug 23 06:06" }, tag = "td" },
          [2] = { content = { [1] = "2025 Aug 31 06:25" }, tag = "td" },
          [3] = { content = { [1] = "2025 Sep 7 18:09" }, tag = "td" },
          [4] = { content = { [1] = "2025 Sep 14 10:33" }, tag = "td" },
        },
        tag = "tr",
      },
      [12] = {
        content = {
          [1] = { content = { [1] = "2025 Sep 21 19:54" }, tag = "td" },
          [2] = { content = { [1] = "2025 Sep 29 23:54" }, tag = "td" },
          [3] = { content = { [1] = "2025 Oct 7 03:47" }, tag = "td" },
          [4] = { content = { [1] = "2025 Oct 13 18:13" }, tag = "td" },
        },
        tag = "tr",
      },
      [13] = {
        content = {
          [1] = { content = { [1] = "2025 Oct 21 12:25" }, tag = "td" },
          [2] = { content = { [1] = "2025 Oct 29 16:21" }, tag = "td" },
          [3] = { content = { [1] = "2025 Nov 5 13:19" }, tag = "td" },
          [4] = { content = { [1] = "2025 Nov 12 05:28" }, tag = "td" },
        },
        tag = "tr",
      },
      [14] = {
        content = {
          [1] = { content = { [1] = "2025 Nov 20 06:47" }, tag = "td" },
          [2] = { content = { [1] = "2025 Nov 28 06:59" }, tag = "td" },
          [3] = { content = { [1] = "2025 Dec 4 23:14" }, tag = "td" },
          [4] = { content = { [1] = "2025 Dec 11 20:52" }, tag = "td" },
        },
        tag = "tr",
      },
      [15] = {
        content = {
          [1] = { content = { [1] = "2025 Dec 20 01:43" }, tag = "td" },
          [2] = { content = { [1] = "2025 Dec 27 19:10" }, tag = "td" },
          [3] = { content = { [1] = "2026 Jan 3 10:03" }, tag = "td" },
          [4] = { content = { [1] = "--" }, tag = "td" },
        },
        tag = "tr",
      },
    },
    tag = "table",
  },
}

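
To make the nesting in the example output easier to follow, here is a minimal sketch of reading cells out of a table with that shape. The table is written by hand (shortened) rather than produced by toTable, and the helpers firstText and rowCells are hypothetical names used only for illustration; they are not part of HTML2Table.lua.

```lua
-- Hand-written table in the same shape as the example output (shortened).
local parsed = {
  {
    tag = "table",
    attributes = { class = "tb1" },
    content = {
      { tag = "tr", content = {
          { tag = "th", content = { "New Moon" } },
          { tag = "th", content = { "First Quarter" } },
      } },
      { tag = "tr", content = {
          { tag = "td", content = { "--" } },
          { tag = "td", content = { "2025 Jan 6 23:56" } },
      } },
    },
  },
}

-- Hypothetical helper: return the first string found in a node's content, or nil.
local function firstText(node)
  for _, item in ipairs(node.content or {}) do
    if type(item) == "string" then return item end
  end
  return nil
end

-- Hypothetical helper: collect the text of every cell in one row of a table node.
local function rowCells(tableNode, rowIndex)
  local cells = {}
  local row = tableNode.content[rowIndex]
  if not row then return cells end
  for _, cell in ipairs(row.content or {}) do
    if type(cell) == "table" then
      cells[#cells + 1] = firstText(cell)
    end
  end
  return cells
end

local headers = rowCells(parsed[1], 1)
local firstRow = rowCells(parsed[1], 2)
print(parsed[1].attributes.class, headers[2], firstRow[2])
```

Every element is a table with a tag field, an optional attributes table, and a content list that mixes plain strings with nested element tables, so walking the structure is a matter of checking type(item) at each step.
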
2. HTML2Table.toHTML()
Converts a structured Lua table back into HTML/XML.

Parameters:
  • tbl (table, required): The Lua table to convert.
  • encode (boolean, optional): Whether to encode the content (requires htmlEntities.lua).
Example output:

Code:

<table class="tb1">
    <tr>
        <th colspan="4">Date and Time (Universal Time)</th>
</tr>
    <tr>
        <th>New Moon</th>
        <th>First Quarter</th>
        <th>Full Moon</th>
        <th>Last Quarter</th>
</tr>
    <tr>
        <td>--</td>
        <td>2025 Jan 6 23:56</td>
        <td>2025 Jan 13 22:27</td>
        <td>2025 Jan 21 20:31</td>
</tr>
    <tr>
        <td>2025 Jan 29 12:36</td>
        <td>2025 Feb 5 08:02</td>
        <td>2025 Feb 12 13:53</td>
        <td>2025 Feb 20 17:32</td>
</tr>
    <tr>
        <td>2025 Feb 28 00:45</td>
        <td>2025 Mar 6 16:31</td>
        <td>2025 Mar 14 06:55</td>
        <td>2025 Mar 22 11:29</td>
</tr>
    <tr>
        <td>2025 Mar 29 10:58</td>
        <td>2025 Apr 5 02:15</td>
        <td>2025 Apr 13 00:22</td>
        <td>2025 Apr 21 01:35</td>
</tr>
    <tr>
        <td>2025 Apr 27 19:31</td>
        <td>2025 May 4 13:52</td>
        <td>2025 May 12 16:56</td>
        <td>2025 May 20 11:59</td>
</tr>
    <tr>
        <td>2025 May 27 03:02</td>
        <td>2025 Jun 3 03:41</td>
        <td>2025 Jun 11 07:44</td>
        <td>2025 Jun 18 19:19</td>
</tr>
    <tr>
        <td>2025 Jun 25 10:31</td>
        <td>2025 Jul 2 19:30</td>
        <td>2025 Jul 10 20:37</td>
        <td>2025 Jul 18 00:38</td>
</tr>
    <tr>
        <td>2025 Jul 24 19:11</td>
        <td>2025 Aug 1 12:41</td>
        <td>2025 Aug 9 07:55</td>
        <td>2025 Aug 16 05:12</td>
</tr>
    <tr>
        <td>2025 Aug 23 06:06</td>
        <td>2025 Aug 31 06:25</td>
        <td>2025 Sep 7 18:09</td>
        <td>2025 Sep 14 10:33</td>
</tr>
    <tr>
        <td>2025 Sep 21 19:54</td>
        <td>2025 Sep 29 23:54</td>
        <td>2025 Oct 7 03:47</td>
        <td>2025 Oct 13 18:13</td>
</tr>
    <tr>
        <td>2025 Oct 21 12:25</td>
        <td>2025 Oct 29 16:21</td>
        <td>2025 Nov 5 13:19</td>
        <td>2025 Nov 12 05:28</td>
</tr>
    <tr>
        <td>2025 Nov 20 06:47</td>
        <td>2025 Nov 28 06:59</td>
        <td>2025 Dec 4 23:14</td>
        <td>2025 Dec 11 20:52</td>
</tr>
    <tr>
        <td>2025 Dec 20 01:43</td>
        <td>2025 Dec 27 19:10</td>
        <td>2026 Jan 3 10:03</td>
        <td>--</td>
</tr>
</table>

3. HTML2Table.getStrings()
Extracts string data from a structured Lua table.

Parameters:
  • tbl (table, required): The Lua table from which to extract strings.
  • tags (table, optional): A list of tags (e.g., {'th', 'td', 'p', 'a', 'thead'}).
  • mode (string, optional, default: 'exclude'): Extraction mode:
    • 'include': Extract strings only from the specified tags.
    • 'exclude': Extract strings from all tags except those specified.
  • returnTags (boolean, optional, default: false): Determines output:
    • true: Return the non-excluded tags themselves.
    • false: Return the string content of the non-excluded tags.
  • wContentOnly (boolean, optional, default: true): Determines filtering:
    • true: Return only non-excluded tags that have string content.
    • false: Return all non-excluded tags.
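
A small sketch of how an optional flag with a default of true (such as wContentOnly) can be normalized. It assumes, as the module code suggests, that the value may arrive as nil, as a boolean, or as the string 'true' / 'false'. The helper name toFlag is hypothetical and not part of HTML2Table.lua.

```lua
-- Hypothetical helper: normalize an optional flag that may be nil, a boolean,
-- or the string 'true' / 'false'.
local function toFlag(value, default)
  if value == nil then return default end           -- argument omitted: use the default
  if type(value) == "boolean" then return value end -- keeps an explicit false
  if value == "true" then return true end
  if value == "false" then return false end
  error("expected a boolean or 'true'/'false', got '" .. tostring(value) .. "'")
end

print(toFlag(nil, true))     -- default applies
print(toFlag(false, true))   -- explicit false is preserved
print(toFlag("false", true)) -- string form is converted
```

Note that the common idiom `value or true` cannot express a default of true, because an explicit false is replaced by true as well; comparing against nil avoids that.
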
Example Outputs:
returnTags=true, wContentOnly=false

Code:

{
  [1] = "tr",
  [2] = "table",
  [3] = "th",
  [4] = "td",
}

returnTags=true, wContentOnly=true

Code:

{
  [1] = "th",
  [2] = "td",
}

returnTags=false, wContentOnly=false

Code:

{
  [1] = "Date and Time (Universal Time)",
  [2] = "New Moon",  [3] = "First Quarter",  [4] = "Full Moon",  [5] = "Last Quarter",
  [6] = "--",  [7] = "2025 Jan 6 23:56",  [8] = "2025 Jan 13 22:27",  [9] = "2025 Jan 21 20:31",
  [10] = "2025 Jan 29 12:36",  [11] = "2025 Feb 5 08:02",  [12] = "2025 Feb 12 13:53",  [13] = "2025 Feb 20 17:32",
  [14] = "2025 Feb 28 00:45",  [15] = "2025 Mar 6 16:31",  [16] = "2025 Mar 14 06:55",  [17] = "2025 Mar 22 11:29",
  [18] = "2025 Mar 29 10:58",  [19] = "2025 Apr 5 02:15",  [20] = "2025 Apr 13 00:22",  [21] = "2025 Apr 21 01:35",
  [22] = "2025 Apr 27 19:31",  [23] = "2025 May 4 13:52",  [24] = "2025 May 12 16:56",  [25] = "2025 May 20 11:59",
  [26] = "2025 May 27 03:02",  [27] = "2025 Jun 3 03:41",  [28] = "2025 Jun 11 07:44",  [29] = "2025 Jun 18 19:19",
  [30] = "2025 Jun 25 10:31",  [31] = "2025 Jul 2 19:30",  [32] = "2025 Jul 10 20:37",  [33] = "2025 Jul 18 00:38",
  [34] = "2025 Jul 24 19:11",  [35] = "2025 Aug 1 12:41",  [36] = "2025 Aug 9 07:55",  [37] = "2025 Aug 16 05:12",
  [38] = "2025 Aug 23 06:06",  [39] = "2025 Aug 31 06:25",  [40] = "2025 Sep 7 18:09",  [41] = "2025 Sep 14 10:33",
  [42] = "2025 Sep 21 19:54",  [43] = "2025 Sep 29 23:54",  [44] = "2025 Oct 7 03:47",  [45] = "2025 Oct 13 18:13",
  [46] = "2025 Oct 21 12:25",  [47] = "2025 Oct 29 16:21",  [48] = "2025 Nov 5 13:19",  [49] = "2025 Nov 12 05:28",
  [50] = "2025 Nov 20 06:47",  [51] = "2025 Nov 28 06:59",  [52] = "2025 Dec 4 23:14",  [53] = "2025 Dec 11 20:52",
  [54] = "2025 Dec 20 01:43",  [55] = "2025 Dec 27 19:10",  [56] = "2026 Jan 3 10:03",  [57] = "--",
}

returnTags=false, wContentOnly=true

Code:

{
  [1] = "Date and Time (Universal Time)",
  [2] = "New Moon",  [3] = "First Quarter",  [4] = "Full Moon",  [5] = "Last Quarter",
  [6] = "--",  [7] = "2025 Jan 6 23:56",  [8] = "2025 Jan 13 22:27",  [9] = "2025 Jan 21 20:31",
  [10] = "2025 Jan 29 12:36",  [11] = "2025 Feb 5 08:02",  [12] = "2025 Feb 12 13:53",  [13] = "2025 Feb 20 17:32",
  [14] = "2025 Feb 28 00:45",  [15] = "2025 Mar 6 16:31",  [16] = "2025 Mar 14 06:55",  [17] = "2025 Mar 22 11:29",
  [18] = "2025 Mar 29 10:58",  [19] = "2025 Apr 5 02:15",  [20] = "2025 Apr 13 00:22",  [21] = "2025 Apr 21 01:35",
  [22] = "2025 Apr 27 19:31",  [23] = "2025 May 4 13:52",  [24] = "2025 May 12 16:56",  [25] = "2025 May 20 11:59",
  [26] = "2025 May 27 03:02",  [27] = "2025 Jun 3 03:41",  [28] = "2025 Jun 11 07:44",  [29] = "2025 Jun 18 19:19",
  [30] = "2025 Jun 25 10:31",  [31] = "2025 Jul 2 19:30",  [32] = "2025 Jul 10 20:37",  [33] = "2025 Jul 18 00:38",
  [34] = "2025 Jul 24 19:11",  [35] = "2025 Aug 1 12:41",  [36] = "2025 Aug 9 07:55",  [37] = "2025 Aug 16 05:12",
  [38] = "2025 Aug 23 06:06",  [39] = "2025 Aug 31 06:25",  [40] = "2025 Sep 7 18:09",  [41] = "2025 Sep 14 10:33",
  [42] = "2025 Sep 21 19:54",  [43] = "2025 Sep 29 23:54",  [44] = "2025 Oct 7 03:47",  [45] = "2025 Oct 13 18:13",
  [46] = "2025 Oct 21 12:25",  [47] = "2025 Oct 29 16:21",  [48] = "2025 Nov 5 13:19",  [49] = "2025 Nov 12 05:28",
  [50] = "2025 Nov 20 06:47",  [51] = "2025 Nov 28 06:59",  [52] = "2025 Dec 4 23:14",  [53] = "2025 Dec 11 20:52",
  [54] = "2025 Dec 20 01:43",  [55] = "2025 Dec 27 19:10",  [56] = "2026 Jan 3 10:03",  [57] = "--",
}

The result is the same as the previous one: when returnTags=false, only string content is ever returned, so wContentOnly has no effect.
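
To illustrate the difference between the 'include' and 'exclude' modes, here is a simplified stand-in for the filtering rule. It is a sketch, not the module's getStrings: it only collects strings, ignores returnTags and wContentOnly, and the function name collect is hypothetical.

```lua
-- Simplified sketch of the include/exclude rule (not the module's full getStrings).
local function collect(nodes, tags, mode, out)
  out = out or {}
  -- Build a set from the tag list for quick lookup.
  local set = {}
  for _, t in ipairs(tags or {}) do set[t] = true end

  for _, node in ipairs(nodes) do
    local matches
    if mode == "include" then
      matches = set[node.tag] == true -- only listed tags contribute strings
    else
      matches = not set[node.tag]     -- every tag except the listed ones
    end
    for _, item in ipairs(node.content or {}) do
      if type(item) == "string" then
        if matches then out[#out + 1] = item end
      elseif type(item) == "table" then
        collect({ item }, tags, mode, out) -- nested tags are always visited
      end
    end
  end
  return out
end

local sample = {
  { tag = "tr", content = {
      { tag = "th", content = { "New Moon" } },
      { tag = "td", content = { "2025 Jan 29 12:36" } },
  } },
}

local onlyTd = collect(sample, { "td" }, "include")
local noTd = collect(sample, { "td" }, "exclude")
print(onlyTd[1], noTd[1])
```

The filter applies to the tag that directly holds a string, so excluding a parent such as tr does not hide the strings inside its th or td children.
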
Note on <!--Comments-->: The toTable function stores each comment as an entry with tag='comment' and the comment text in its content, i.e. content={'their content'}. The toHTML function recognizes these entries and prints them as <!--Comment Content-->.

Anything that is not a valid <tag> is stored in content as a string. For example, scripts should be stored as a plain string in the content of the <script> tag.
E.g. { tag='script', content={'all scripts'} }.

Void tags are stored normally; however, toHTML returns them with a closing tag, e.g. <br></br>.
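
If the closing tags on void elements are a problem, one possible workaround is to post-process the string returned by toHTML. The sketch below is an assumption about how that could be done, not something the module provides; collapseVoidTags is a hypothetical helper, and the list covers commonly used HTML void elements.

```lua
-- Commonly used HTML void elements.
local VOID = {
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
}

-- Hypothetical helper: turn "<br></br>" into "<br>" for void elements only.
local function collapseVoidTags(html)
  for _, name in ipairs(VOID) do
    -- Capture the opening tag (name plus attributes) and drop the closing tag
    -- that immediately follows it, allowing whitespace in between.
    html = html:gsub("<(" .. name .. "[^>]*)>%s*</" .. name .. ">", "<%1>")
  end
  return html
end

print(collapseVoidTags('<p>line one<br></br>line two</p>'))
print(collapseVoidTags('<img src="moon.png"></img>'))
```

Non-void elements are left untouched, since only names from the list are rewritten.
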

HTML2Table.lua code:
Note: Comments are ChatGPT generated :Whistle

Code:

--[[
HTML2Table is a module that converts HTML code to Lua tables and the other way around.
-by RicardoTM.
For encoding and decoding, the htmlEntities.lua module by TiagoDanin is required: https://github.com/TiagoDanin/htmlEntities-for-lua
More info and questions: https://forum.rainmeter.net/viewtopic.php?t=44955#p231448
]]

local HTML2Table = {}

-- Extract HTML/XML into a structured table
function HTML2Table.toTable(html, decode)
    -- Converts an HTML string into a nested Lua table representation.
    -- Parameters:
    -- html: A string containing HTML content (tags, attributes, comments, and text).
    -- decode: (Optional) A boolean to indicate whether to decode HTML character references.
    -- Returns:
    -- A nested Lua table representing the structure of the HTML content. For example:
    -- {
    --   {tag="div", attributes={class="container"}, content={"Some text", {tag="span", content={"nested"}}}}
    -- }
    if not html then return error('No HTML code found') end
    local decode = decode or false
    local idx = 1 -- Initialize current index for parsing
    local tbl = {} -- Initialize the table to hold the parsed HTML structure

    if decode and not htmlEntities then
        print('Warning: htmlEntities.lua is required to decode.')
    end

    -- Function to trim leading and trailing whitespace from a string
    local function trim(s)
        return (s:gsub("^%s+", ""):gsub("%s+$", ""))
    end

    html = trim(html) -- Remove excess whitespace from the input HTML string

    -- Function to parse HTML comments
    local function parseComment(startIndex)
        -- Looks for comments in the format <!-- comment content -->
        local commentStart, commentEnd = html:find("<!%-%-(.-)%-%->", startIndex)
        if commentStart and commentEnd then
            local commentContent = html:sub(commentStart + 4, commentEnd - 3) -- Extract comment content
            return {tag = "comment", content = {trim(commentContent)}}, commentEnd + 1
        end
        -- Unterminated comment: stop parsing (returning startIndex here would loop forever)
        return nil, #html + 1
    end

    -- Function to parse an HTML tag and its content
    local function parseTag(startIndex)
        -- Find the start of an opening tag
        local tagStart = html:find("<([^/][^>]*)", startIndex)
        if not tagStart then return nil, #html + 1 end -- End parsing if no more tags are found

        -- Check if the tag is a comment
        if html:sub(tagStart, tagStart + 3) == "<!--" then
            return parseComment(startIndex) -- Delegate to comment parser
        end

        -- Find the end of the opening tag (search from tagStart so a stray ">" in earlier text is ignored)
        local tagEnd = html:find(">", tagStart)
        if not tagEnd then return nil, #html + 1 end -- End parsing if no tag closure is found

        -- Extract the tag name
        local tagName = html:sub(tagStart + 1, tagEnd - 1):match("([%w:_%-%.]+)")
        if not tagName then return nil, tagEnd + 1 end -- Skip malformed tags

        -- Initialize a table to represent the tag
        local tagData = {tag = tagName, content = {}}

        -- Extract attributes from the tag
        local attributesStr = html:sub(tagStart + 1, tagEnd - 1):match("%s(.+)")
        if attributesStr then
            tagData.attributes = {}
            -- "-" and "." are escaped inside the character class, and the closing quote
            -- must match the opening one (%2), so single-quoted values also work
            for attr, _, value in attributesStr:gmatch("([%w:_%.%-]+)%s*=%s*([\"'])(.-)%2") do
                tagData.attributes[attr] = value -- Store attributes as key-value pairs
            end
        end

        -- Find the closing tag and handle nested content
        local nextPos = tagEnd + 1
        local closingTagPattern = "</" .. tagName .. ">"
        -- Plain find (4th argument true): tag names may contain "-" or ".", which are magic in Lua patterns
        local closingTagStart = html:find(closingTagPattern, nextPos, true)
        if not closingTagStart then return tagData, nextPos end -- If no closing tag, return tag as is

        -- Process the content inside the tag
        while closingTagStart do
            local nextTagStart = html:find("<([^/][^>]*)", nextPos)
            if nextTagStart and nextTagStart < closingTagStart then
                -- Add text content before the next nested tag
                local textBeforeTag = html:sub(nextPos, nextTagStart - 1)
                if textBeforeTag:match("%S") and not textBeforeTag:match("<.*>") then
                    if decode and htmlEntities then
                        textBeforeTag = htmlEntities.decode(textBeforeTag) -- Optionally decode references
                    end
                    table.insert(tagData.content, trim(textBeforeTag))
                end
                -- Parse the nested tag
                local nestedTagData, newPos = parseTag(nextTagStart)
                if nestedTagData then
                    table.insert(tagData.content, nestedTagData)
                end
                nextPos = newPos
            else
                -- Add text content before the closing tag
                local textContent = html:sub(nextPos, closingTagStart - 1)
                if textContent:match("%S") and not textContent:match("<.*>") then
                    if decode and htmlEntities then
                        textContent = htmlEntities.decode(textContent)
                    end
                    table.insert(tagData.content, trim(textContent))
                end
                break
            end
        end
        return tagData, closingTagStart + #closingTagPattern -- Return parsed tag and position
    end

    -- Main loop to process the entire HTML string
    while idx <= #html do
        local tagData, newPos = parseTag(idx) -- Parse the next tag
        if tagData then
            table.insert(tbl, tagData) -- Add parsed tag to the result table
        end
        if newPos > #html then break end -- Exit loop if end of string is reached
        idx = newPos -- Move index to the next position
    end
    return tbl
end

-- Convert a structured Lua table into an HTML/XML string
function HTML2Table.toHTML(tbl, encode)
    -- Converts a nested Lua table representation of HTML into an HTML-formatted string.
    -- Parameters:
    -- tbl: A table with a structure like:
    --   {
    --     {tag="th", attributes={colspan=2}, content={"text", {tag="span", content={"nested text"}}}}
    --   }
    -- encode: (Optional) A boolean to indicate whether to encode special HTML characters in text content.
    -- Returns:
    -- A string containing HTML with properly formatted tags, attributes, and content.
    if not tbl then return error('HTML2Table.toHTML(): No table found') end -- Ensure a valid table is provided
    local encode = encode or false
    local indentLevel = 0 -- Tracks the current level of indentation for better formatting
    local indent = string.rep("    ", indentLevel) -- Base indentation string

    if encode and not htmlEntities then
        print('Warning: htmlEntities.lua is required to encode.')
    end

    -- Function to convert a table of attributes into a string
    local function processAttributes(attributes)
        -- Parameters:
        -- attributes: A table of key-value pairs representing HTML attributes (e.g., {class="btn", id="submit"}).
        -- Returns:
        -- A string formatted as key-value pairs for use in an HTML tag (e.g., ' class="btn" id="submit"').
        local attrStr = ""
        if attributes then
            for key, value in pairs(attributes) do
                attrStr = attrStr .. string.format(' %s="%s"', key, value) -- Concatenate attributes
            end
        end
        return attrStr
    end

    -- Function to process the content of a tag
    local function processContent(content, level)
        -- Parameters:
        -- content: The content of the tag, which can be text, nested tables, or a mix.
        -- level: The current indentation level for formatting.
        -- Returns:
        -- A string representing the processed HTML content.
        local html = ""
        if type(content) == "table" then
            -- Iterate over the content table, which may contain strings or nested tags
            for _, item in ipairs(content) do
                if type(item) == "table" and item.tag then
                    -- Handle table entries with a "tag" field (i.e., HTML elements)
                    if item.tag == "comment" then
                        -- Special handling for comments (e.g., <!--comment content-->)
                        local commentContent = table.concat(item.content or {}, "")
                        html = html .. string.format(
                            '\n%s<!--%s-->\n',
                            string.rep("    ", level), -- Indent the comment based on its nesting level
                            commentContent
                        )
                    else
                        -- Handle regular HTML tags (e.g., <div>, <span>)
                        local attributes = processAttributes(item.attributes) -- Process attributes
                        local innerContent = processContent(item.content, level + 1) -- Process nested content
                        html = html .. string.format(
                            '\n%s<%s%s>%s</%s>\n',
                            string.rep("    ", level), -- Indent the tag based on its nesting level
                            item.tag, -- Tag name (e.g., "div")
                            attributes, -- Attributes string (e.g., ' class="example"')
                            innerContent, -- Inner content (e.g., text or more tags)
                            item.tag -- Closing tag (e.g., </div>)
                        )
                    end
                elseif type(item) == "string" then
                    -- Handle plain text content
                    if encode and htmlEntities then
                        item = htmlEntities.encode(item) -- Optionally encode special HTML characters
                    end
                    html = html .. item -- Append the text content
                end
            end
        end
        return html
    end

    -- Start processing the input table from the top level
    local html = processContent(tbl, indentLevel)
    return html -- Return the final HTML string
end

-- Convert a structured table to a list of strings.
function HTML2Table.getStrings(tbl, tags, mode, returnTags, wContentOnly)
    --[[
    Parameters:
    tbl: A table containing nested tag data with the structure:
    { {tag="tagName", attributes={attribute=value}, content={"text", {tag="nestedTag", content={"nested text"}}}}, ... }
    tags: (Optional) A list of tags to include or exclude, e.g., {"p", "span"}.
    mode: (Optional) Specifies whether tags are included ("include") or excluded ("exclude"). Default is "exclude".
    returnTags: (Optional) Boolean indicating whether to return the tags themselves (true) or their content (false).
    wContentOnly: (Optional) Boolean indicating whether to return only tags that have string content (default is true).
    Return:
    Table containing extracted strings based on the specified mode and options.
    ]]
    local result = {}
    tags = tags or {}
    mode = mode or "exclude"
    returnTags = returnTags or false
    -- `wContentOnly or true` would turn an explicit false into true, so test for nil instead
    if wContentOnly == nil then wContentOnly = true end

    -- Splits a string into a table of substrings, using commas as delimiters
    local function splitToTable(str)
        -- Taken from: https://forum.rainmeter.net/viewtopic.php?p=231310#p231308
        if type(str) ~= 'string' then return error("splitToTable(): Input has to be a string. Not a "..type(str)) end
        local fields = {}
        for field in str:gmatch('([^,%s*]+)') do
            fields[#fields + 1] = field
        end
        return fields
    end

    -- Converts strings 'true' or 'false' to boolean true or false
    local function trueOrFalse(str)
        local str = str or 'false'
        if str == 'true' then return true
        elseif str == 'false' then return false
        else return error("trueOrFalse(): String has to be either 'true' or 'false', '"..str.."' is invalid.")
        end
    end

    if type(tags) ~= 'table' then
        tags = splitToTable(tags)
    end
    if type(returnTags) ~= 'boolean' then
        returnTags = trueOrFalse(returnTags)
    end
    if type(wContentOnly) ~= 'boolean' then
        wContentOnly = trueOrFalse(wContentOnly)
    end

    -- Convert the tags list to a set for quick lookup
    local tagsSet = {}
    for _, tag in ipairs(tags) do
        tagsSet[tag] = true
    end

    -- Recursive function to process content from nested tags
    local function processContent(tagData)
        -- Parameters:
        -- tagData: A table representing a tag with potential nested content.
        if tagData.content then
            local tagMatches = (mode == "include" and tagsSet[tagData.tag]) or
                               (mode == "exclude" and not tagsSet[tagData.tag])
            local hasStringContent = false
            for _, contentItem in ipairs(tagData.content) do
                if type(contentItem) == "string" then
                    hasStringContent = true
                    if tagMatches and not returnTags then
                        table.insert(result, contentItem)
                    end
                elseif type(contentItem) == "table" then
                    processContent(contentItem)
                end
            end
            -- If returning tags and the tag matches
            if tagMatches and returnTags then
                -- Include the tag only if it has string content when `wContentOnly` is true
                if not wContentOnly or (wContentOnly and hasStringContent) then
                    if not result[tagData.tag] then
                        result[tagData.tag] = true
                    end
                end
            end
        end
    end

    -- Iterate through the main table and process each tag
    for _, tagData in ipairs(tbl) do
        processContent(tagData)
    end

    -- If returning tags, convert the set to a list
    if returnTags then
        local tagsList = {}
        for tag in pairs(result) do
            table.insert(tagsList, tag)
        end
        return tagsList
    end
    return result
end

return HTML2Table

Statistics: Posted by RicardoTM — Yesterday, 7:40 am — Replies 0 — Views 37


