FBB::String(3) Several operations on std::string objects

SYNOPSIS

#include <bobcat/string>
Linking option: -lbobcat

DESCRIPTION

This class offers frequently used transformations on std::string objects that are not supported by the std::string class itself. All members of FBB::String are static.

NAMESPACE

FBB
All members, operators and manipulators, mentioned in this man-page, are defined in the namespace FBB.

INHERITS FROM

--

ENUMERATION

o
Type:
This enumeration indicates the nature of the contents of an element in the array returned by the overloaded split members (see below).
DQUOTE, a subset of the characters in the matching string element was delimited by double quotes in the string that was parsed by the split members.
DQUOTE_UNTERMINATED, the contents of the string that was parsed by the split members started at some point with a double quote, but the matching ending double quote was lacking.
ESCAPED_END, the contents of the string that was parsed by the split members ended in a mere backslash.
NORMAL, a normal string.
SEPARATOR, a separator.
SQUOTE, a subset of the characters in the matching string element was delimited by single quotes in the string that was parsed by the split members.
SQUOTE_UNTERMINATED, the contents of the string that was parsed by the split members started at some point with a single quote, but the matching ending single quote was lacking.

TYPEDEF

The typedef SplitPair represents std::pair<std::string, String::Type> and is used by the split members that return or fill a std::vector<SplitPair> (see below).

HISTORY

Initially this class was derived from std::string. Deriving from std::string, however, is considered bad design, as std::string was not designed as a base class.

FBB::String offers a series of static member functions providing the facilities originally implemented as non-static members.

STATIC MEMBER FUNCTIONS

o
char const **argv(std::vector<std::string> const &words):
Returns a pointer to an allocated series of pointers to the C strings stored in the vector words. The caller is responsible for returning the array of pointers to the common pool, but should not delete the C-strings to which the pointers point. The last element of the returned array is guaranteed to be a 0-pointer.
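The ownership rule described above can be illustrated with a minimal standalone sketch. It does not use Bobcat: makeArgv is a hypothetical helper, and the sketch assumes that the array of pointers is allocated with new[], so that "returning it to the common pool" means calling delete[] on it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Bobcat: returns a 0-terminated array of
// pointers to the C strings owned by the std::string objects in `words`.
char const **makeArgv(std::vector<std::string> const &words)
{
    // Assumption: the array of pointers is allocated with new[].
    char const **ret = new char const *[words.size() + 1];

    for (std::size_t idx = 0; idx != words.size(); ++idx)
        ret[idx] = words[idx].c_str();  // points into words, nothing is copied

    ret[words.size()] = 0;              // the terminating 0-pointer
    return ret;
}
```

Under these assumptions the caller releases only the array itself (delete[] on the returned pointer). The C strings remain owned by the vector's elements, so the pointers are valid only for as long as those elements are left unmodified.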
o
int casecmp(std::string const &lhs, std::string const &rhs):
Performs a case-insensitive comparison between the two std::string objects. A negative value is returned if lhs should be ordered before rhs; 0 is returned if the two strings have identical contents when case is ignored; a positive value is returned if lhs should be ordered after rhs.
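The return value convention can be sketched without Bobcat as follows. The function caseCompare is hypothetical; it assumes a character-by-character comparison after conversion to lower case, which may differ from what the library actually does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not Bobcat's implementation: compares two strings
// while ignoring the case of their letters.
int caseCompare(std::string const &lhs, std::string const &rhs)
{
    std::size_t const len = lhs.size() < rhs.size() ? lhs.size() : rhs.size();

    for (std::size_t idx = 0; idx != len; ++idx)
    {
        // tolower requires a value representable as unsigned char
        int const left = std::tolower(static_cast<unsigned char>(lhs[idx]));
        int const right = std::tolower(static_cast<unsigned char>(rhs[idx]));

        if (left != right)
            return left - right;        // first differing character decides
    }

    if (lhs.size() == rhs.size())
        return 0;                       // identical when case is ignored

    return lhs.size() < rhs.size() ? -1 : 1;    // the shorter string comes first
}
```

Only the sign of a non-zero return value is meaningful in this sketch.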
o
std::string escape(std::string const &str, char const *series = "'\"\\"):
Returns a copy of the str object in which all characters in series are prefixed by a backslash character.
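The effect of the series parameter can be sketched as follows. This is a standalone illustration using a hypothetical function escapeChars, not the library's own code.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in, not Bobcat's implementation: prefixes every
// character of `str` that also occurs in `series` with a backslash.
std::string escapeChars(std::string const &str, char const *series = "'\"\\")
{
    std::string ret;

    for (std::string::size_type idx = 0; idx != str.size(); ++idx)
    {
        char const ch = str[idx];

        // strchr also matches the terminating 0-byte, so 0-bytes are skipped
        if (ch != 0 && std::strchr(series, ch) != 0)
            ret += '\\';

        ret += ch;
    }
    return ret;
}
```

With the default series, single quotes, double quotes and backslashes each receive a backslash prefix; all other characters are copied unchanged.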
o
std::string lc(std::string const &str):
Returns a copy of the str object in which all letters were transformed to lower case letters.
o
std::string trim(std::string const &str):
Returns a copy of the str object from which the leading and trailing blanks have been removed.
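A minimal standalone sketch of this behavior is shown below. The function trimBlanks is hypothetical, and the sketch assumes that "blanks" means space and tab characters; the library may use a different character set.

```cpp
#include <string>

// Hypothetical stand-in, not Bobcat's implementation: removes leading and
// trailing blanks from a copy of `str`.
std::string trimBlanks(std::string const &str)
{
    char const *blanks = " \t";         // assumption: blanks are space and tab

    std::string::size_type const first = str.find_first_not_of(blanks);

    if (first == std::string::npos)     // empty or blanks only
        return std::string();

    std::string::size_type const last = str.find_last_not_of(blanks);

    return str.substr(first, last - first + 1);
}
```

Blanks between the first and last non-blank characters are left untouched.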
o
std::vector<std::string> split(Type *type, std::string const &str, char const *separators = " \t", bool addEmpty = false):
Returns a vector containing the elements in str which are separated from each other by at least one of the characters found in separators. The member's first parameter points to a Type variable, which will show DQUOTE_UNTERMINATED, SQUOTE_UNTERMINATED, or ESCAPED_END in cases where the contents of str are ill-formed, or NORMAL if str's contents show no syntactic errors (i.e., no ill-formed strings or escape sequences). If the corresponding argument equals 0 then no Type indication is provided.
If the parameter addEmpty is set to true, then individual separators encountered in str are stored as empty strings in the returned vector (e.g., if two elements are separated by three blank spaces, then the returned vector contains three empty strings between the two elements).
If an element in str contains a double quote ("), then all characters from the initial double quote through the matching double quote character are processed as follows: the surrounding double quotes are removed, and the remaining characters are unescaped using the String::unescape member. The resulting unescaped string is added to the element currently under construction. E.g., if str contains
    string="\"hello world\""
        
then the element becomes
    string="hello world"
        

If an element in str contains a single quote ('), then all characters between the initial quote and the matching quote character are literally appended to the element currently under construction. E.g., if str contains
    string='"hello\ world"'
        
then the element becomes
    string="hello\ world"
        

Backslash characters encountered in str outside of single or double quoted strings are unescaped (using String::unescape) and the resulting character is appended to the element currently under construction.
E.g., if str contains
    string=\"hello\ world\"
        
then the element becomes
    string="hello world"
        

o
std::vector<SplitPair> split(std::string const &str, char const *separators = " \t", bool addEmpty = false):
Same functionality as the previous split member, but the returned vector holds pairs, of which the first elements represent the recognized strings, and the second elements are values of the String::Type enumeration. If addEmpty is requested, then the string elements contain the actual contents of the separator, while the Type elements are set to SEPARATOR. If the returned vector is not empty then the second member of the last element may be DQUOTE_UNTERMINATED, SQUOTE_UNTERMINATED, or ESCAPED_END in cases where the contents of str are ill-formed.
o
size_t split(std::vector<std::string> *words, std::string const &str, char const *separators = " \t", bool addEmpty = false):
Fills words with all elements of the str object, separated by any of the characters in separators. If the parameter addEmpty is set to true, the individual separators are stored as empty strings in words. If a word starts with " or ' all characters until a matching terminating " or ' at the end of a word are considered as one word. The surrounding quotes are not stored. The function returns the number of elements in the vector pointed to by words. This vector is initially cleared.
o
size_t split(std::vector<SplitPair> *words, std::string const &str, char const *separators = " \t", bool addEmpty = false):
Same functionality as the former member, but the words vector is filled with pairs, of which the first elements are the recognized strings, and the second elements are values of the String::Type enumeration. If addEmpty is requested, then the string elements contain the actual contents of the separator, while the Type elements are set to SEPARATOR.
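The way addEmpty affects the pair-based split members can be sketched as follows. The sketch is deliberately reduced: it handles separators only and ignores quotes and backslashes altogether. The names Kind, Field and splitSketch are hypothetical and merely mirror String::Type and SplitPair.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical reduced counterparts of String::Type and SplitPair.
enum Kind { NORMAL, SEPARATOR };
typedef std::pair<std::string, Kind> Field;

// Not Bobcat's implementation: splits `str` at any character in
// `separators`. Quotes and backslashes receive no special treatment.
std::vector<Field> splitSketch(std::string const &str,
                               char const *separators = " \t",
                               bool addEmpty = false)
{
    std::vector<Field> ret;
    std::string::size_type pos = 0;

    while (pos != str.size())
    {
        std::string::size_type end = str.find_first_of(separators, pos);

        if (end == pos)                 // str[pos] is a separator
        {
            if (addEmpty)               // store each individual separator
                ret.push_back(Field(std::string(1, str[pos]), SEPARATOR));
            ++pos;
            continue;
        }

        if (end == std::string::npos)   // the last word runs to the end
            end = str.size();

        ret.push_back(Field(str.substr(pos, end - pos), NORMAL));
        pos = end;
    }
    return ret;
}
```

In this sketch, splitting "one  two" (two spaces) yields two NORMAL fields when addEmpty is false, and four fields (a word, two SEPARATOR fields each holding one space, and a word) when addEmpty is true.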
o
std::string unescape(std::string const &str):
Returns a copy of the str object in which the escaped (i.e., prefixed by a backslash) characters have been interpreted. All standard escape characters (\a, \b, \f, \n, \r, \t, \v) are recognized. If an escape character is followed by x the next two characters are interpreted as a hexadecimal number. If an escape character is followed by an octal digit, then the next three characters following the backslash are interpreted as an octal number. In all other cases, the backslash is removed and the character following the backslash is kept.
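The rules above can be sketched in a standalone function. The name unescapeSketch and its helpers are hypothetical. The description does not say what happens with a trailing backslash or with incomplete hexadecimal or octal sequences; the sketch assumes that a trailing backslash is kept and that incomplete sequences fall back to the "remove the backslash, keep the character" rule.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace
{
    // Value of `ch` as a digit in `base` (8 or 16), or -1 if it is not one.
    int digitValue(char ch, int base)
    {
        int value = -1;
        if (ch >= '0' && ch <= '9')
            value = ch - '0';
        else if (ch >= 'a' && ch <= 'f')
            value = ch - 'a' + 10;
        else if (ch >= 'A' && ch <= 'F')
            value = ch - 'A' + 10;
        return value < base ? value : -1;
    }

    // Converts `count` digits starting at str[pos]; -1 if that is impossible.
    int number(std::string const &str, std::size_t pos, std::size_t count,
               int base)
    {
        if (pos + count > str.size())
            return -1;
        int result = 0;
        for (std::size_t idx = pos; idx != pos + count; ++idx)
        {
            int const digit = digitValue(str[idx], base);
            if (digit < 0)
                return -1;
            result = result * base + digit;
        }
        return result;
    }
}

// Hypothetical stand-in, not Bobcat's implementation.
std::string unescapeSketch(std::string const &str)
{
    static char const letters[] = "abfnrtv";
    static char const values[] = "\a\b\f\n\r\t\v";

    std::string ret;
    for (std::size_t idx = 0; idx != str.size(); ++idx)
    {
        if (str[idx] != '\\' || idx + 1 == str.size())
        {
            ret += str[idx];    // plain character, or assumed: trailing backslash kept
            continue;
        }

        char const next = str[++idx];           // character after the backslash
        char const *pos = next ? std::strchr(letters, next) : 0;
        int value = -1;

        if (pos != 0)                           // standard escape character
            value = values[pos - letters];
        else if (next == 'x' && (value = number(str, idx + 1, 2, 16)) >= 0)
            idx += 2;                           // consumed two hex digits
        else if ((value = number(str, idx, 3, 8)) >= 0)
            idx += 2;                           // consumed three octal digits

        ret += value >= 0 ? static_cast<char>(value) : next;
    }
    return ret;
}
```

For instance, in this sketch the eight characters \x41\102 are converted to the two characters AB.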
o
std::string uc(std::string const &str):
Returns a copy of the str object in which all letters were capitalized.
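The case conversions offered by lc and uc can be sketched with the standard library alone. The functions toLower and toUpper are hypothetical stand-ins that assume the conversions of the current C locale; they are not the library's own code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-ins, not Bobcat's implementation. Each receives its
// argument by value and returns the modified copy.
std::string toLower(std::string str)
{
    std::transform(str.begin(), str.end(), str.begin(),
        [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return str;
}

std::string toUpper(std::string str)
{
    std::transform(str.begin(), str.end(), str.begin(),
        [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
    return str;
}
```

Characters that are not letters, such as digits and punctuation, are copied unchanged.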

EXAMPLE

#include <iostream>
#include <string>
#include <vector>
#include <bobcat/string>
using namespace std;
using namespace FBB;
// Names of the String::Type values, assumed to be listed in the
// enumeration's declaration order.
char const *type[] = 
{
    "DQUOTE_UNTERMINATED",
    "SQUOTE_UNTERMINATED",
    "ESCAPED_END",
    "SEPARATOR",
    "NORMAL",
    "DQUOTE",
    "SQUOTE",
};
int main(int argc, char **argv)
{
    cout << "Program's name in uppercase: " << String::uc(argv[0]) << endl;
    if (argc == 1)
        cout << "Provide any argument to suppress SEPARATOR fields\n";
    while (true)
    {
        cout << "Enter a line, or empty line to stop:" << endl;
        string line;
        if (!getline(cin, line) || !line.length())
            break;
        vector<String::SplitPair> splitpair;
        // All members are static: the line to split is passed as an argument.
        cout << "Split into " << 
                String::split(&splitpair, line, " \t", argc == 1) << 
                " fields\n"; 
        for 
        (
            vector<String::SplitPair>::iterator it = splitpair.begin();
                it != splitpair.end();
                    ++it
        )
            cout << (it - splitpair.begin() + 1) << ": " <<
                    type[it->second] << ": `" << it->first << 
                    "', unescaped: `" << String::unescape(it->first) << 
                    "'" << endl;
    }
    return 0;
}
    

FILES

bobcat/string - defines the class interface

BUGS

None Reported.

DISTRIBUTION FILES

  • bobcat_4.02.00-x.dsc: detached signature;
  • bobcat_4.02.00-x.tar.gz: source archive;
  • bobcat_4.02.00-x_i386.changes: change log;
  • libbobcat1_4.02.00-x_*.deb: debian package holding the libraries;
  • libbobcat1-dev_4.02.00-x_*.deb: debian package holding the libraries, headers and manual pages;
  • http://sourceforge.net/projects/bobcat: public archive location;

BOBCAT

Bobcat is an acronym of `Brokken's Own Base Classes And Templates'.

COPYRIGHT

This is free software, distributed under the terms of the GNU General Public License (GPL).

AUTHOR

Frank B. Brokken ([email protected]).