category-group: strings
layer: 3
header file(s): z_allstrings.h
libraries: libz00.lib libz01.lib libz02.lib libz03.lib
synopsis.
Layer 3 string subroutines have to do with number-to-string conversion and text representation of numbers.
[C] functions (aka subroutines):
z_num_words()
SIGNATURE: count_t z_num_words (const string_o &str, int *pi)
SYNOPSIS: calculate the number of "words" in the given string 'str'
DESCRIPTION: given a string such as "Hark, the cannon booms!", this will return 4.
RETURNS: [n >= 0]: number of words in 'str'
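The counting rule above can be illustrated with a minimal sketch. It assumes a "word" is any whitespace-separated token, which matches the "Hark, the cannon booms!" example but may not match every case the library handles. The name count_words() and the use of std::string in place of string_o are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the behaviour described for z_num_words().
// Assumption: a "word" is a run of non-whitespace characters.
std::size_t count_words(const std::string &text)
{
    std::istringstream in(text);
    std::string token;
    std::size_t n = 0;
    while (in >> token)   // operator>> skips any run of whitespace
        ++n;
    return n;
}
```

Punctuation attached to a token (such as the comma in "Hark,") is counted as part of that word in this sketch.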
z_extract_leading_tabs()
SIGNATURE: count_t z_extract_leading_tabs (const textstring_o &sin, textstring_o &sot, count_t n = 1, int *pi = NULL)
SYNOPSIS: this function is a string line filter that takes out leading TAB (ASCII 0x09) characters.
PARAMETERS:
- sin: the input text string
- sot: the output text string
- n: the number of tab characters to remove
DESCRIPTION:
Given a multi-line text string object 'sin' where each line starts with one or more tabs, this function writes to 'sot' a copy of the text with the leading tab deleted from each line (if n is 1). If there are 'n' tabs at the start of a line, they will be excised from the string. If the start of a line contains more than 'n' tab characters (ASCII 9), only 'n' tabs will be removed.
This function is a specialization of z_extract_leading_chars(). All the behaviours of this function are the same as that function (which see for more information).
EXAMPLE:
Given a string "s" such as "Only once in a while,\n\tperhaps even today,\n\t\tthere is a blue moon."
The resultant string "s2" after z_extract_leading_tabs(s, s2, 1); would be (the third line keeps one of its two tabs):
    Only once in a while,
    perhaps even today,
    	there is a blue moon.
RETURNS: [n >= 0]: the number of lines found in 'sin'
z_extract_leading_chars()
SIGNATURE: count_t z_extract_leading_chars (const textstring_o &s, textstring_o &s_out, char c, count_t n, int *pi = NULL)
SYNOPSIS:
This function is a generalization of z_extract_leading_tabs() and is used by that function. It extracts up to 'n' characters 'c' found at the start of each line in 's'.
PARAMETERS:
- s: the input text
- s_out: the corresponding text of 's', processed
- c: the character to extract. Only characters matching 'c' at the start of a line are affected.
- n: the [maximum] number of characters matching 'c' to extract.
- pi: [output] error indicator variable. values:
    0: ok; success
    zErr_DataBadFormat: line encountered not starting with 'n' 'c' characters
DESCRIPTION:
This function is intended to be used on multi-line text where the line is pre-padded with a specific character. The intent is to take text that would normally be indented by a tab character and strip the tabs, so that the block of text can then be further processed.
The input string 's' must have at least 'n' occurrences of 'c' at the start of each line. If not, it is an error condition and processing is aborted.
EXAMPLE:
    string_o sot, sin("XXTC, or not 2B\nXThat's a Q.\nXXXWhether 'tis nobler..\n");
    count_t num_lines = z_extract_leading_chars(sin, sot, 'X', 1);
    std::cout << "Hamlet said:\n" << sot;
In this example, z_extract_leading_chars() returns 3 and would produce the following output:
    Hamlet said:
    XTC, or not 2B
    That's a Q.
    XXWhether 'tis nobler..
Remember that the individual lines of text inside 's' must begin with [at least 'n' of] the target character in question ('c').
RETURNS: [n >= 0]: the number of lines encountered - processed
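The per-line rule described above (remove exactly 'n' leading copies of 'c', and abort on the first line that has fewer) can be sketched as follows. This is not the library's implementation: strip_leading_chars() is a hypothetical name, std::string stands in for textstring_o, and a bool output replaces the 'pi' error variable.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the behaviour described for z_extract_leading_chars().
// Returns the number of lines processed. Sets 'ok' to false and stops at the
// first line that does not begin with at least 'n' copies of 'c'.
std::size_t strip_leading_chars(const std::string &in, std::string &out,
                                char c, std::size_t n, bool &ok)
{
    out.clear();
    ok = true;
    std::size_t lines = 0;
    std::size_t pos = 0;
    while (pos < in.size())
    {
        std::size_t eol = in.find('\n', pos);
        const bool has_newline = (eol != std::string::npos);
        if (!has_newline)
            eol = in.size();

        // The line must start with 'n' copies of 'c'.
        if (eol - pos < n || in.compare(pos, n, std::string(n, c)) != 0)
        {
            ok = false;          // analogous to zErr_DataBadFormat
            return lines;
        }
        out.append(in, pos + n, eol - pos - n);   // line without its prefix
        if (has_newline)
            out.push_back('\n');
        ++lines;
        pos = has_newline ? eol + 1 : eol;
    }
    return lines;
}
```

Applied to the Hamlet input with c = 'X' and n = 1, the sketch processes 3 lines and leaves any extra 'X' characters beyond the first in place.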
z_extract_leading_pattern()
SIGNATURE: int z_extract_leading_pattern (const string_o &patt, string_o &s, int *pi = NULL)
SYNOPSIS:
This function takes a regular expression in 'patt', and applies it to each line in 's'. If the pattern matches the start of the line, it is removed.
TRAITS:
This function appears to work on only a single line (multi-line text not affected beyond the first instance). This function is slated for upgrade and should not be used.
z_extract_strs()
SIGNATURE: void z_extract_strs (string_o text, const string_o &pattern, vlist_o &words)
SYNOPSIS: This function "finds all entries of the given regular expression and adds them to the given list [words]"
DESCRIPTION: The origin or purpose of this function is unknown.
TRAITS:
This function appears to be an unused relic. It should be avoided and may change drastically or disappear in the future.
z_extract_pattern()
SIGNATURE: int z_extract_pattern (const string_o &patt, string_o &s, int *pi = NULL)
SYNOPSIS:
This function is like string_o::search_destroy(). It searches for a regular expression pattern 'patt' in the string 's', and if found, removes the resultant substring from 's'.
PARAMETERS:
- patt: a string containing a regular expression pattern.
- s: the string to process. All occurrences of 'patt' will be removed from 's'.
RETURNS: 0
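The search-and-remove behaviour can be approximated with the standard library. The sketch below assumes ECMAScript regular expression syntax (the std::regex default); the dialect used by string_o is not stated here and may differ. The name remove_pattern() is hypothetical, and unlike z_extract_pattern() it returns the number of matches removed.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Hypothetical analogue of z_extract_pattern(): removes every match of
// 'pattern' from 's' in place and reports how many matches were removed.
int remove_pattern(const std::string &pattern, std::string &s)
{
    const std::regex re(pattern);   // ECMAScript syntax by default
    const auto first = std::sregex_iterator(s.begin(), s.end(), re);
    const int hits =
        static_cast<int>(std::distance(first, std::sregex_iterator()));
    s = std::regex_replace(s, re, "");
    return hits;
}
```

For instance, removing "[0-9]+" from "a1b22c333" leaves "abc".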
z_find_anycase()
SIGNATURE: size_t z_find_anycase (const string_o &target, const string_o &s, int *pie, string_o::z_StrCase mix = string_o::z_Str_AnyCase)
SYNOPSIS:
This function searches for the first occurrence of fixed string 'target' in 's'. If 'mix' is set to string_o::z_Str_AnyCase, any case (upper, lower, mixed) of 'target' in 's' will be matched.
TRAITS: this function needs further QA testing and is currently considered unreliable.
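Until this function is considered reliable, a case-insensitive search is easy to sketch with the standard library. The helper below is a hypothetical stand-in (find_ignoring_case() is not a library name); it handles only the any-case mode, works on std::string, and returns std::string::npos when nothing is found, which may not be what z_find_anycase() returns.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch: position of the first match of 'target' in 's',
// ignoring ASCII case, or std::string::npos if there is no match.
std::size_t find_ignoring_case(const std::string &target, const std::string &s)
{
    auto same = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    const auto it =
        std::search(s.begin(), s.end(), target.begin(), target.end(), same);
    return it == s.end() ? std::string::npos
                         : static_cast<std::size_t>(it - s.begin());
}
```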
z_worded_number()
SIGNATURE: string_o z_worded_number (count_t x, int *pi, boolean add_commas)
SYNOPSIS: This function returns the value of x printed out as words.
PARAMETERS:
- x: the integer value
- pi: [output] error indicator variable. always 0
- add_commas: if TRUE, a value of 'x' such as 1500 will return "one thousand, five hundred". If FALSE, the returned string will be "one thousand five hundred".
EXAMPLE:
Here are some inputs and outputs (no-commas mode):
    2000    "two thousand"
    458     "four hundred fifty eight"
    1524    "one thousand five hundred twenty four"
RETURNS: [string object] string containing the English words equivalent for the value of 'x'.
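The conversion can be sketched by splitting the value into groups of three digits and naming each group. This is an illustration under stated assumptions, not the library's code: worded_number() and words_below_1000() are hypothetical names, the input is limited to 32 bits, and the output for 0 ("zero") is a guess, since the text above does not say what the library returns for it.

```cpp
#include <cstdint>
#include <string>

// Words for 1..999; returns "" for 0.
static std::string words_below_1000(unsigned n)
{
    static const char *const ones[] = {
        "", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char *const tens[] = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"};

    std::string out;
    if (n >= 100)
    {
        out += ones[n / 100];
        out += " hundred";
        n %= 100;
        if (n != 0)
            out += ' ';
    }
    if (n >= 20)
    {
        out += tens[n / 10];
        if (n % 10 != 0)
        {
            out += ' ';
            out += ones[n % 10];
        }
    }
    else if (n > 0)
    {
        out += ones[n];
    }
    return out;
}

// Hypothetical sketch of the behaviour described for z_worded_number().
std::string worded_number(std::uint32_t x, bool add_commas)
{
    if (x == 0)
        return "zero";   // assumption: not documented above
    static const char *const scales[] = {"", " thousand", " million", " billion"};
    std::string out;
    // Walk the three-digit groups from least to most significant.
    for (int scale = 0; x > 0 && scale < 4; ++scale, x /= 1000)
    {
        const unsigned chunk = x % 1000;
        if (chunk == 0)
            continue;    // e.g. the "000" in 2000 contributes nothing
        const std::string part = words_below_1000(chunk) + scales[scale];
        if (out.empty())
            out = part;
        else
            out = part + (add_commas ? ", " : " ") + out;
    }
    return out;
}
```

The sketch follows the style of the examples above: no hyphens ("fifty eight") and no "and" after "hundred".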
z_numeric_digits()
SIGNATURE: string_o z_numeric_digits (const string_o &numeric, int *pexi = NULL)
SYNOPSIS: This function converts "common numerics" to a string of digits.
DESCRIPTION:
This function is related to z_worded_number(). It processes a string representation of a number, such as "4th" or "two thousand" and returns a string containing the pure-number equivalent (as a string). In the two examples here, it would return "4" and "2000", respectively.
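The two documented cases ("4th" and "two thousand") suggest two paths: strip a suffix from leading digits, or sum up number words. The sketch below covers only those two paths and is not the library's parser. The name numeric_digits(), the bool error output, and the accepted vocabulary (lowercase English words up to "million", with optional trailing commas) are all assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch of the behaviour described for z_numeric_digits().
// Sets 'ok' to false and returns "" if the text is not understood.
std::string numeric_digits(const std::string &numeric, bool &ok)
{
    ok = true;

    // Case 1: leading digits followed by a suffix, as in "4th" or "21st".
    std::size_t i = 0;
    while (i < numeric.size() &&
           std::isdigit(static_cast<unsigned char>(numeric[i])))
        ++i;
    if (i > 0)
        return numeric.substr(0, i);

    // Case 2: lowercase number words, as in "two thousand".
    static const std::map<std::string, unsigned long> small = {
        {"zero", 0}, {"one", 1}, {"two", 2}, {"three", 3}, {"four", 4},
        {"five", 5}, {"six", 6}, {"seven", 7}, {"eight", 8}, {"nine", 9},
        {"ten", 10}, {"eleven", 11}, {"twelve", 12}, {"thirteen", 13},
        {"fourteen", 14}, {"fifteen", 15}, {"sixteen", 16},
        {"seventeen", 17}, {"eighteen", 18}, {"nineteen", 19},
        {"twenty", 20}, {"thirty", 30}, {"forty", 40}, {"fifty", 50},
        {"sixty", 60}, {"seventy", 70}, {"eighty", 80}, {"ninety", 90}};

    unsigned long total = 0;   // completed thousands/millions groups
    unsigned long group = 0;   // value of the group being read
    bool any = false;
    std::istringstream in(numeric);
    std::string word;
    while (in >> word)
    {
        any = true;
        if (word.back() == ',')
            word.pop_back();   // tolerate "one thousand, five hundred"
        const auto it = small.find(word);
        if (it != small.end())
            group += it->second;
        else if (word == "hundred")
            group *= 100;
        else if (word == "thousand")
        {
            total += group * 1000UL;
            group = 0;
        }
        else if (word == "million")
        {
            total += group * 1000000UL;
            group = 0;
        }
        else
        {
            ok = false;
            return "";
        }
    }
    if (!any)
    {
        ok = false;
        return "";
    }
    return std::to_string(total + group);
}
```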
z_num_to_ordstring()
SIGNATURE: string_o z_num_to_ordstring (count_t n, int *pi = NULL)
SYNOPSIS: This function returns the ordinal form of 'n' as a string.
PARAMETERS:
- n: the value to print
- pi: [output] error indicator variable. returns 0, always.
DESCRIPTION:
This function is related to z_worded_number() and z_numeric_digits(). It returns a string representing the ordinal value of 'n', as English text. For example, z_num_to_ordstring(21) returns "21st".
z_ordinal_suffix()
SIGNATURE: string_o z_ordinal_suffix (count_t i, int *pi = NULL)
SYNOPSIS: return a string that is the ordinal suffix for the value of 'i'
DESCRIPTION:
z_ordinal_suffix(1) returns "st"; z_ordinal_suffix(2) returns "nd"; z_ordinal_suffix(3) returns "rd", and so on
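The "and so on" above hides one irregularity of English ordinals: values ending in 11, 12, or 13 take "th" (11th, 112th), not "st", "nd", or "rd". The sketch below shows the usual rule and how an ordinal string such as "21st" can be built from it. The names ordinal_suffix() and num_to_ordstring() are hypothetical stand-ins, and whether the library handles the 11 to 13 case the same way is an assumption.

```cpp
#include <string>

// Hypothetical sketch of the rule behind z_ordinal_suffix().
std::string ordinal_suffix(unsigned long i)
{
    const unsigned long last_two = i % 100;
    if (last_two >= 11 && last_two <= 13)
        return "th";                 // 11th, 12th, 13th, 111th, ...
    switch (i % 10)
    {
        case 1:  return "st";
        case 2:  return "nd";
        case 3:  return "rd";
        default: return "th";
    }
}

// Hypothetical sketch of z_num_to_ordstring(): digits plus suffix.
std::string num_to_ordstring(unsigned long n)
{
    return std::to_string(n) + ordinal_suffix(n);
}
```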
z_breakoff_sentences()
SIGNATURE: string_o z_breakoff_sentences (string_o &s, const size_t n, const boolean eat, int *pi = NULL)
SYNOPSIS: This subroutine extracts sentences from a string, up to 'n' bytes.
DESCRIPTION:
This subroutine retrieves text in 's', from the first character (position 0) up to 'n'. If the length of 's' is less than 'n', the entire string is returned. If 'eat' is TRUE, the text returned is removed from 's'. It looks for complete sentences.
EXAMPLE:
    string_o s2, s1 = "That is all. Amen! Got it?";
    for (int i=0; i < 3; i++)
        s2 = z_breakoff_sentences(s1, 15, TRUE);
On the 1st iteration, s2 will have "That is all."; on the 2nd, s2 has " Amen! Got it?". If we change the max character size from 15 to 12, s2 will have "That is all.", then " Amen!", then " Got it?".
TRAITS:
This subroutine is new [2013] and needs more testing. It does not handle special cases well and can be easily fooled. Quotes are ignored. Given 'My quote is.. "What? It can't be!". Such is life. This is the end!', the text may get split after "is.." or "What?" when it is desirable to keep the text together there.
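A naive version of the sentence-breaking rule can be sketched as: look at the first 'n' characters, cut after the last '.', '!' or '?' found there, and return everything if the string already fits. This is only an illustration. breakoff_sentences() is a hypothetical name, std::string replaces string_o, and cutting at exactly 'n' characters when no terminator is found is an assumption. Like the subroutine described above, the sketch ignores quotes and is easily fooled by "..".

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the behaviour described for z_breakoff_sentences().
std::string breakoff_sentences(std::string &s, std::size_t n, bool eat)
{
    if (n == 0)
        return std::string();

    std::size_t cut = s.size();          // whole string if it fits in 'n'
    if (s.size() > n)
    {
        // Last sentence terminator among the first 'n' characters.
        const std::size_t last = s.find_last_of(".!?", n - 1);
        cut = (last == std::string::npos) ? n : last + 1;
    }
    const std::string head = s.substr(0, cut);
    if (eat)
        s.erase(0, cut);                 // consume the returned text
    return head;
}
```

With "That is all. Amen! Got it?" and eat set to true, successive calls with n = 12 yield "That is all.", " Amen!", and " Got it?".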
z_squishtext()
SIGNATURE: int z_squishtext (textstring_o &s, const flag_o &flag = 0, int *pi = NULL)
SYNOPSIS:
This subroutine "compresses vertically" a multi-line text string (if there is only 1 line in 's', it does nothing). This is intended to clean up funky blocks of text, such as classified ad postings that have been converted from HTML to straight text, and end up with lots of consecutive empty lines.
PARAMETERS:
- s: a string object with the text to process. Since this is a non-const reference, the output is also in this object
- flag: a bag of options defining how to process the text. Since this is a bit-mask, or-ing the values will turn on all the corresponding bits. The bit slot meanings:
    0: strip all empty lines, not just 2+ consecutive lines
    1: process "\r" (Microsoft-style), too
    2: [PROPOSED - NOT IMPLEMENTED] trim each line (at the end)
    3: [PROPOSED - NOT IMPLEMENTED] pre-trim the lines, too
- pi: [output] error indicator variable. returns 0, always.
DESCRIPTION:
This subroutine has a number of options, which are specified by the second parameter, 'flag'. The routine will always handle "\n"-only EOL markers (i.e., unix-style). You can additionally process the "\r\n" EOL protocol (commonly found in Microsoft). Trimming each line, so that trailing or preceding whitespace is culled off, is proposed (bits 2 and 3) but not yet implemented.
EXAMPLE:
    textstring_o s("Do you know \nthe way,\n\n\n..to San Jose?");
    s += "\r\n\r\nWell do you, punk?";
    flag_o f;
    f.set(1); // do "\r\n\r\n" (1 exists)
    z_squishtext (s, f);
The output, 's', will be this string:
    Do you know \nthe way,\n\n..to San Jose?\n\r\nWell do you, punk?
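The core idea of vertical compression, collapsing runs of empty lines, can be sketched as below. This is a simplified illustration and not the library's algorithm: squish() and the two flag constants are hypothetical names, the flags are a plain unsigned bit-mask instead of flag_o, and the sketch leaves "\r\n" terminators unchanged, so it does not attempt to reproduce the exact "\r\n" output shown in the example above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical flag values mirroring bit slots 0 and 1 described above.
enum : unsigned
{
    kStripAllEmpty = 1u << 0,   // drop every empty line
    kHandleCR      = 1u << 1    // also treat "\r\n" as an empty line
};

// Hypothetical sketch: by default a run of 2+ empty lines collapses to one.
std::string squish(const std::string &text, unsigned flags)
{
    std::string out;
    bool prev_empty = false;
    std::size_t pos = 0;
    while (pos < text.size())
    {
        const std::size_t eol = text.find('\n', pos);
        const std::size_t next =
            (eol == std::string::npos) ? text.size() : eol + 1;
        const std::string line = text.substr(pos, next - pos);  // keeps its EOL
        pos = next;

        // A line is "empty" when it holds nothing but its terminator.
        const bool empty =
            (line == "\n") || ((flags & kHandleCR) && line == "\r\n");
        if (empty && ((flags & kStripAllEmpty) || prev_empty))
            continue;            // drop this empty line
        out += line;
        prev_empty = empty;
    }
    return out;
}
```

With no flags set, "the way,\n\n\n..to San Jose?" becomes "the way,\n\n..to San Jose?", which agrees with the "\n"-only part of the example above.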