category-group: www
layer: 4
header file(s): z_websubs.h
libraries: libz00.lib libz01.lib libz02.lib libz03.lib libz04.lib

synopsis.
There is a small cluster of subroutines aimed at processing a URL:

  • z_str_lookslike_website()
  • is_z_valid_TLD()
  • z_TLD_from_domainname()
  • z_cTLD_to_country()
  • z_strip_htmltags()
  • z_yank_xmlfield_fromtext()
  • z_parse_forminput()
  • z_eatyank_html_attribute()

The term "TLD" on this page refers to Top Level Domain: the right-most part of a domain name. The original TLDs include "com", "gov", "mil", "edu", and "org". TLDs also include two-letter abbreviations for countries, such as:

  • kz - Kazakhstan
  • pa - Panama
  • pe - Peru
  • ua - Ukraine
  • vn - Vietnam
These codes almost always match the official 2-character ISO country code, but not always. As of 2016 there are 3 known exceptions:
  name              ISO code  TLD
  United Kingdom    GB        .uk
  Saint Martin      MF        .gp
  Saint Barthelemy  BL        .gp
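
The exception table above can be expressed as a small override map. This is a minimal sketch, not part of the Z library: `iso_to_cctld` is a hypothetical helper, and it encodes only the three exceptions listed on this page.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not a Z library function): maps a 2-character ISO
// country code to its country-code TLD. Only the three exceptions listed
// above are special-cased; any other code maps to itself, lowercased.
std::string iso_to_cctld(const std::string &iso)
{
    static const std::map<std::string, std::string> exceptions = {
        {"gb", "uk"},   // United Kingdom
        {"mf", "gp"},   // Saint Martin
        {"bl", "gp"},   // Saint Barthelemy
    };

    std::string code;
    for (unsigned char c : iso)
        code += static_cast<char>(std::tolower(c));

    auto it = exceptions.find(code);
    return (it != exceptions.end()) ? it->second : code;
}
```

The sketch does not validate its input; it assumes the caller passes a real ISO code.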

[C] functions (aka subroutines):

z_str_lookslike_website()
SIGNATURE: boolean z_str_lookslike_website (const string_o &s, zinet_TLD_t &domt, int *pi)
SYNOPSIS: checks if 's' appears to be a valid domain name. the string must not contain any "get" part (stuff like "/foo=bar").
DESCRIPTION:
this routine does not check whether the domain actually exists. it is strictly a syntax validator. the domain name must have at least 1 dot ('.') and must have a valid top-level domain ("TLD") after the last dot.
RETURNS:
TRUE: looks like a domain name
FALSE: otherwise (not a domain name)
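
The rules in the description can be illustrated with a standalone sketch. It does not use the Z library: `looks_like_website` and the tiny `known_tlds` table are hypothetical, and the case-insensitive comparison is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical stand-in for the library's TLD table; deliberately tiny.
static const std::set<std::string> known_tlds = {
    "com", "gov", "mil", "edu", "org", "kz", "pa", "pe", "ua", "vn", "uk"
};

// Syntax-only check: no path or query part, at least one dot, something
// before the last dot, and a known TLD after it. No network lookup is done.
bool looks_like_website(const std::string &s)
{
    if (s.find_first_of("/?=") != std::string::npos)
        return false;                       // contains a "get" part

    const std::size_t dot = s.rfind('.');
    if (dot == std::string::npos || dot == 0)
        return false;                       // no dot, or nothing before it

    std::string tld = s.substr(dot + 1);
    std::transform(tld.begin(), tld.end(), tld.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return known_tlds.count(tld) > 0;
}
```

With this table, "example.com" passes, while "example", "example.com/foo=bar" and "example.nosuchtld" are rejected.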
 

is_z_valid_TLD()
SIGNATURE: boolean is_z_valid_TLD (const string_o &s, zinet_TLD_t &tld, int *pi)
SYNOPSIS: examines a string containing a domain name. it can be in any case.
PARAMETERS

  • s: [input] a domain name
  • tld: [output] the TLD type, an enumerated type defined in "z_websubs.h". this variable must exist and be provided by the calling code. it is set to 'zt_TLD_undefined' if no TLD is found.
  • pi: [output] an error indicator output 'flag' variable. values:
    0: TLD is valid, was found to be ok
    zErr_NotFound: no such TLD
    zErr_Impossible_CaseValue: internal error
RETURNS:
TRUE: looks like a TLD
FALSE: otherwise (not a TLD)
WARNING: The TLD
 

z_TLD_from_domainname()
SIGNATURE: string_o z_TLD_from_domainname (const string_o &s, zinet_TLD_t &dom, int *pi = NULL)
SYNOPSIS: returns the TLD code from a domain name
PARAMETERS

  • s: [input] a domain name
  • dom: [output] the TLD type, an enumerated type defined in "z_websubs.h". this variable must be provided by the calling code. it is set to 'zt_TLD_undefined' if no TLD is found or an error occurs.
  • pi: [output] an error indicator output 'flag' variable. values:
    0: TLD is valid, was found to be ok
DESCRIPTION: this is a convenience function that uses other functions in this group to extract the TLD from a string.
RETURNS: string containing the TLD, or empty ("") upon error
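
The calling pattern (string result, enumerated output parameter, optional error flag) can be sketched without the library. Everything named here is hypothetical: the `Tld` enumeration, `kErrNotFound`, and `tld_from_domain` are stand-ins, not the contents of "z_websubs.h".

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-ins for the library's enumeration and error code.
enum class Tld { undefined, com, org, uk };
constexpr int kErrNotFound = 1;

// Returns the lowercased TLD of 'name', or "" on error. 'tld' is always
// set, and 'err' is set whenever the caller supplies it.
std::string tld_from_domain(const std::string &name, Tld &tld, int *err = nullptr)
{
    static const std::map<std::string, Tld> table = {
        {"com", Tld::com}, {"org", Tld::org}, {"uk", Tld::uk}
    };

    tld = Tld::undefined;               // pessimistic defaults
    if (err) *err = kErrNotFound;

    const std::size_t dot = name.rfind('.');
    if (dot == std::string::npos)
        return "";

    std::string s = name.substr(dot + 1);
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    auto it = table.find(s);
    if (it == table.end())
        return "";

    tld = it->second;
    if (err) *err = 0;
    return s;
}
```

Setting the failure values first keeps every early return consistent: the enumeration is "undefined" and the string is empty unless the lookup succeeds.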
 

z_cTLD_to_country()
SIGNATURE: string_o z_cTLD_to_country (const string_o &s, int *pi = NULL)
SYNOPSIS: looks up a TLD code, returns the country name
PARAMETERS

  • s: [input] a TLD
  • pi: [output] an error indicator output 'flag' variable. values:
    0: TLD is valid; was found to be ok
    zErr_TypeIncorrect: the top-level domain is valid, but does not represent a country (eg, ".com")
    zErr_NotFound: the top-level domain was not found
RETURNS: string, contains country name, if successful; empty otherwise
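
The two failure cases differ: a valid but non-country TLD is reported separately from an unknown one. A minimal standalone sketch of that distinction follows. The names `cctld_to_country`, `kErrNotFound` and `kErrTypeIncorrect` are hypothetical, the tables hold only the codes mentioned on this page, and the input is assumed to be lowercase with no leading dot.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical error codes standing in for zErr_NotFound and
// zErr_TypeIncorrect.
constexpr int kErrNotFound = 1;
constexpr int kErrTypeIncorrect = 2;

// Returns the country name for a country-code TLD, or "" otherwise.
std::string cctld_to_country(const std::string &tld, int *err = nullptr)
{
    static const std::map<std::string, std::string> countries = {
        {"kz", "Kazakhstan"}, {"pa", "Panama"}, {"pe", "Peru"},
        {"ua", "Ukraine"}, {"vn", "Vietnam"}
    };
    static const std::set<std::string> generic = {
        "com", "gov", "mil", "edu", "org"
    };

    int code = 0;
    std::string result;
    auto it = countries.find(tld);
    if (it != countries.end())
        result = it->second;            // a country code: success
    else if (generic.count(tld) > 0)
        code = kErrTypeIncorrect;       // valid TLD, but not a country
    else
        code = kErrNotFound;            // not a TLD at all

    if (err) *err = code;
    return result;
}
```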
 

z_strip_htmltags()
SIGNATURE: int z_strip_htmltags (string_o &s, int *pi = NULL)
SYNOPSIS: removes all HTML tags from 's'. That is, any "<tag>" or "</tag>" is excised.
DESCRIPTION:
given text in 's' containing this:

<CENTER>the main sequence.<CENTER> <BR>\n<FONT size=2 color='pink'> To be, or not to be, blah blah (woof woof) </FONT> <HR>
The text in 's' will be changed so that it contains no HTML:
the main sequence. - To be, or not to be, blah blah (woof woof) - -
This function provides a quick way to clean out all HTML from a block of text. It makes formatting adjustments so that the resulting text is laid out similarly to the original text.
RETURNS: 0
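
The core idea of dropping everything between "<" and ">" can be sketched as a single pass over the text. This is an illustration only: `strip_tags` is a hypothetical name, it works on std::string rather than string_o, and it makes none of the formatting adjustments mentioned above.

```cpp
#include <string>

// Hypothetical sketch: removes every "<...>" run from 's' in place.
// Text following an unterminated "<" is dropped as well.
void strip_tags(std::string &s)
{
    std::string out;
    bool in_tag = false;
    for (char c : s) {
        if (!in_tag && c == '<')
            in_tag = true;              // tag opens: stop copying
        else if (in_tag && c == '>')
            in_tag = false;             // tag closes: resume copying
        else if (!in_tag)
            out += c;                   // ordinary text
    }
    s = out;
}
```

A scanner this simple is confused by a ">" inside a quoted attribute value; a production routine would need to track quoting.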
 

z_yank_xmlfield_fromtext()
SIGNATURE: int z_yank_xmlfield_fromtext (textstring_o &ts, string_o &nam, string_o &val, boolean munch, int *pi = NULL)
SYNOPSIS:
this subroutine extracts the next "<tag>value</tag>" construct. given "<tag>value-string</tag>", this puts "tag" into output variable 'nam' and "value-string" into output variable 'val'.
PARAMETERS

  • ts: [input] an XML-like item
  • nam: [output] the HTML/XML tag name
  • val: [output] the text between the opening and closing tags
  • munch: [input] ?
  • pi: [output] an error indicator output 'flag' variable. values:
    0: success
    zErr_IsEmpty: string 'ts' is empty (0-length)
    zErr_Data_BadFormat: no opening "<" character was found, or the last character was not the closing ">" symbol, or the tag names did not match (or possibly other syntax errors)
RETURNS:
0: success
1: empty string provided
-1: error (see 'pi')
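
The return codes and the bad-format conditions listed above can be sketched on a single, whole-string construct. The helper `yank_field` is hypothetical and works on std::string. It ignores the 'munch' parameter and does not scan for a "next" construct inside longer text.

```cpp
#include <string>

// Hypothetical sketch. Return codes follow the list above:
//   0 success, 1 empty input, -1 malformed input.
int yank_field(const std::string &text, std::string &name, std::string &value)
{
    name.clear();
    value.clear();
    if (text.empty())
        return 1;
    if (text.front() != '<' || text.back() != '>')
        return -1;                      // must open with "<" and end with ">"

    // A ">" always exists here, because the last character is one.
    const std::size_t open_end = text.find('>');
    name = text.substr(1, open_end - 1);

    const std::string closing = "</" + name + ">";
    if (name.empty()
        || text.size() < open_end + 1 + closing.size()
        || text.compare(text.size() - closing.size(), closing.size(), closing) != 0) {
        name.clear();
        return -1;                      // missing or mismatched closing tag
    }

    value = text.substr(open_end + 1,
                        text.size() - closing.size() - open_end - 1);
    return 0;
}
```

For "<city>Lima</city>" the sketch yields the name "city" and the value "Lima".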
 

z_parse_forminput()
SIGNATURE: int z_parse_forminput (string_o &s, string_o &typ, string_o &nam, string_o &val, string_o &klas, string_o &id, string_o &siz, string_o &och, string_o &ock, string_o &alt, boolean &is_set, int *pi)
SYNOPSIS:
this subroutine parses the contents of string 's', which is expected to be an HTML INPUT item. The contents of 's' must be exactly that. Thus, it must start with a < and end with a >. All attributes of an INPUT are stored into individual (output) parameters. this subroutine might appeal to those programmers who prefer a simple (albeit long) ordered list of function parameters.
PARAMETERS

  • s: [input] a string containing an HTML INPUT item
  • pi: [output] an error indicator output 'flag' variable. values:
    0: good parse
    zErr_Require_Failure: 's' is not an "<INPUT>" item
    zErr_Param_BadVal: name-value pair encountered with an illegal value. this is probably "ischecked=[junk]" (the value must be "true" or "false" only)
    [other values]: see function z_eatyank_html_attribute()
 

z_eatyank_html_attribute()
SIGNATURE: int z_eatyank_html_attribute (string_o &s, string_o &nam, string_o &val, count_t &n, int *pi)
SYNOPSIS: extracts name-value pairs from an HTML INPUT item
PARAMETERS

  • s: [input] a string containing an HTML INPUT item, that is, text like "<INPUT [name]=[val] ...>"
  • n: [input] the number of the attribute to fetch (1st one, n=0)
  • pi: [output] an error indicator output 'flag' variable. values:
    0: successfully extracted the next name-value pair
    zErr_NoData: string is empty
    zErr_OperationFinished: all done; no more data
    zErr_Data_NotTerminated: string not terminated (w. ">")
    zErr_Data_Unexpected: garbage encountered; text is not [name]=[val]
    zErr_Data_BadSyntax: expected "=" but got something else in its place
DESCRIPTION: this subroutine is apparently intended to be used in a loop
WARNING:
this routine is raw and needs further development. currently only the first attribute is processed. setting "n=1", "n=2", etc. will not work.
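
Until the routine handles more than the first attribute, the loop it is meant to support can be written directly. The sketch below is independent of the Z library: `parse_attributes` is a hypothetical helper that collects all [name]=[val] pairs in one call rather than fetching them one at a time, and it reports every failure as a plain 'false' instead of the distinct error codes above.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: collects name=value pairs from text such as
// "<INPUT a=b c='d e'>". Returns false if the text is not delimited by
// "<" and ">", or if a name is not followed by "=".
bool parse_attributes(const std::string &s,
                      std::vector<std::pair<std::string, std::string>> &out)
{
    out.clear();
    if (s.size() < 2 || s.front() != '<' || s.back() != '>')
        return false;

    std::size_t i = 1;
    const std::size_t end = s.size() - 1;       // index of the closing '>'
    auto is_space = [&](std::size_t k) {
        return std::isspace(static_cast<unsigned char>(s[k])) != 0;
    };
    auto skip_ws = [&] { while (i < end && is_space(i)) ++i; };
    auto read_word = [&] {
        const std::size_t start = i;
        while (i < end && !is_space(i) && s[i] != '=') ++i;
        return s.substr(start, i - start);
    };

    skip_ws();
    read_word();                                // element name, e.g. INPUT
    for (skip_ws(); i < end; skip_ws()) {
        const std::string name = read_word();
        if (name.empty() || i >= end || s[i] != '=')
            return false;                       // expected "="
        ++i;                                    // step over '='

        std::string value;
        if (i < end && (s[i] == '"' || s[i] == '\'')) {
            const char quote = s[i++];
            const std::size_t close = s.find(quote, i);
            if (close == std::string::npos || close >= end)
                return false;                   // unterminated quoted value
            value = s.substr(i, close - i);
            i = close + 1;
        } else {
            value = read_word();                // unquoted value
        }
        out.emplace_back(name, value);
    }
    return true;
}
```

Valueless attributes such as a bare "checked" are rejected by this sketch, since every name must be followed by "=".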
 

warnings.
Many of the routines in this group rely on static, hard-coded tables of countries and TLDs. These tables may be out of date, so the results are not guaranteed to be correct.