CHRPAK
Strings and Characters
CHRPAK is a C++ library which handles characters and strings.
CHRPAK began when I simply wanted to be able to capitalize a string.
It has since expanded to cover a number of other tasks, including:
- string '31.2' <=> numeric value 31.2;
- uppercase <=> lowercase;
- removal of control characters or blanks;
- sorting, merging, searching.
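As a rough illustration of the first three kinds of task listed above, here is a minimal sketch that uses only the C++ standard library. The names cap_sketch, control_blank_sketch and to_r8_sketch are hypothetical; they are not CHRPAK's own routines or signatures.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a CHRPAK signature): return an uppercase copy of s.
std::string cap_sketch(std::string s) {
    for (char &c : s) {
        // Cast to unsigned char first: std::toupper is undefined for negative values.
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// Hypothetical helper: replace every control character (tab, newline, ...) by a blank.
std::string control_blank_sketch(std::string s) {
    for (char &c : s) {
        if (std::iscntrl(static_cast<unsigned char>(c))) {
            c = ' ';
        }
    }
    return s;
}

// Hypothetical helper: read the numeric value at the start of a string.
// std::stod throws std::invalid_argument if no number can be read.
double to_r8_sketch(const std::string &s) {
    return std::stod(s);
}
```

For example, cap_sketch("hello, World") yields "HELLO, WORLD", and to_r8_sketch("31.2") yields the numeric value 31.2.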
Many of the routine names begin with the name of the data type they
operate on:
- B4 - a 4 byte word;
- CH - a character;
- CHVEC - a vector of characters;
- DEC - a decimal fraction;
- DIGIT - a character representing a numeric digit;
- I4 - an integer;
- R4 - a real;
- R8 - a double precision real;
- RAT - a ratio I/J;
- S - a string;
- SVEC - a vector of strings;
- SVECI - a vector of strings, implicitly capitalized.
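The prefix convention can be pictured with a small sketch. The type aliases and the routines i4_max_sketch, s_eqi_sketch and sveci_search_sketch below are illustrative assumptions, not CHRPAK declarations. In particular, the sketch reads "implicitly capitalized" as "compared without regard to case", which is an interpretation rather than something stated above.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed mapping of some prefixes onto C++ types (illustrative only).
using i4 = std::int32_t;                // I4: an integer stored in 4 bytes
using r4 = float;                       // R4: a real
using r8 = double;                      // R8: a double precision real
using svec = std::vector<std::string>;  // SVEC: a vector of strings

static_assert(sizeof(i4) == 4, "I4 is expected to occupy 4 bytes");

// Hypothetical I4 routine: the name starts with the type it operates on.
i4 i4_max_sketch(i4 a, i4 b) {
    return a > b ? a : b;
}

// Hypothetical S routine: true if two strings are equal, ignoring case.
bool s_eqi_sketch(const std::string &a, const std::string &b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        int ca = std::toupper(static_cast<unsigned char>(a[i]));
        int cb = std::toupper(static_cast<unsigned char>(b[i]));
        if (ca != cb) {
            return false;
        }
    }
    return true;
}

// Hypothetical SVECI routine: index of the first entry equal to key,
// ignoring case, or -1 if there is no such entry.
int sveci_search_sketch(const svec &v, const std::string &key) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (s_eqi_sketch(v[i], key)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With these definitions, searching the vector {"alpha", "Beta", "GAMMA"} for "gamma" returns index 2, because the comparison treats both strings as if they were capitalized.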
Licensing:
The computer code and data files described and made available on this web page
are distributed under the GNU LGPL license.
Languages:
CHRPAK is available in a C version, a C++ version, a FORTRAN77 version,
a FORTRAN90 version, and a MATLAB version.
Reference:
- Carl Branden, John Tooze, Introduction to Protein Structure, Second Edition, Garland Publishing, 1999, ISBN: 0815323050, LC: QP551.B7635.
- Paul Bratley, Bennett Fox, Linus Schrage, A Guide to Simulation, Second Edition, Springer, 1987, ISBN: 0387964673, LC: QA76.9.C65.B73.
- IEEE Standards Committee 754, IEEE Standard for Binary Floating Point Arithmetic, ANSI/IEEE Standard 754-1985, SIGPLAN Notices, Volume 22, Number 2, 1987, pages 9-25.
- Donald Knuth, The Art of Computer Programming, Volume 3, Sorting and Searching, Second Edition, Addison Wesley, 1998, ISBN: 0201896850, LC: QA76.6.K64.
- Albert Nijenhuis, Herbert Wilf, Combinatorial Algorithms for Computers and Calculators, Academic Press, 1978, ISBN: 0-12-519260-6, LC: QA164.N54.
List of Routines:
- A_TO_I4 returns the index of an alphabetic character.
- BASE_TO_I4 returns the value of an integer represented in some base.
- BYTE_TO_INT converts 4 bytes into an unsigned integer.
- CH_CAP capitalizes a single character.
- CH_COUNT_CVEC_ADD adds a character vector to a character count.
- CH_COUNT_FILE_ADD adds characters in a file to a character count.
- CH_COUNT_INIT initializes a character count.
- CH_COUNT_PRINT prints a set of character counts.
- CH_COUNT_S_ADD adds a character string to a character histogram.
- CH_EQI is true if two characters are equal, disregarding case.
- CH_INDEX_FIRST finds the first occurrence of a character in a string.
- CH_INDEX_LAST finds the last occurrence of a character in a string.
- CH_IS_ALPHA is TRUE if a character is alphabetic.
- CH_IS_ALPHANUMERIC is TRUE if a character is alphanumeric.
- CH_IS_CONTROL is TRUE if a character is a control character.
- CH_IS_DIGIT returns TRUE if a character is a decimal digit.
- CH_IS_FORMAT_CODE returns TRUE if a character is a FORTRAN format code.
- CH_IS_LOWER is TRUE if C is a lowercase alphabetic character.
- CH_IS_PRINTABLE determines if a character is printable.
- CH_IS_SPACE is TRUE if a character represents "white space".
- CH_IS_UPPER is TRUE if C is an uppercase alphabetic character.
- CH_LOW lowercases a single character.
- CH_PAD "pads" a character in a string with a blank on either side.
- CH_READ reads one character from a binary file.
- CH_SCRABBLE returns the character on a given Scrabble tile.
- CH_SWAP swaps two characters.
- CH_TO_DIGIT returns the integer value of a base 10 digit.
- CH_TO_DIGIT_BIN returns the integer value of a binary digit.
- CH_TO_DIGIT_OCT returns the integer value of an octal digit.
- CH_TO_ROT13 converts a character to its ROT13 equivalent.
- CH_UNIFORM returns a random character in a given range.
- CH_WRITE writes one character to a binary file.
- CHARSTAR_ADJUSTL flushes a CHAR* string left.
- CHARSTAR_CAT concatenates two CHAR*'s to make a third.
- CHARSTAR_EQI reports whether two CHAR*'s are equal, ignoring case.
- CHARSTAR_LEN_TRIM returns the length of a CHAR* to the last nonblank.
- DIGIT_BIN_TO_CH returns the character representation of a binary digit.
- DIGIT_INC increments a decimal digit.
- DIGIT_OCT_TO_CH returns the character representation of an octal digit.
- DIGIT_TO_CH returns the base 10 digit character corresponding to a digit.
- GETBITS returns N bits from an unsigned int X, beginning at position P.
- HEX_DIGIT_TO_I4 converts a hexadecimal digit to an I4.
- HEX_TO_BINARY_DIGITS converts a hexadecimal digit to 4 binary digits.
- HEX_TO_I4 converts a hexadecimal string to its integer value.
- I4_HUGE returns a "huge" I4, usually the largest legal signed int.
- I4_INPUT prints a prompt string and reads an I4 from the user.
- I4_LOG_10 returns the whole part of the logarithm base 10 of an I4.
- I4_MAX returns the maximum of two I4's.
- I4_MIN returns the smaller of two I4's.
- I4_SWAP switches two I4's.
- I4_TO_A returns the I-th alphabetic character.
- I4_TO_AMINO_CODE converts an integer to an amino code.
- I4_TO_HEX_DIGIT converts a (small) I4 to a hexadecimal digit.
- I4_TO_ISBN converts an I4 to an ISBN digit.
- I4_TO_MONTH_ABB returns an abbreviated month name.
- I4_TO_S converts an I4 to a string.
- I4_TO_STRING converts an I4 to a C++ string.
- I4_TO_UNARY produces the "base 1" representation of an I4.
- I4_UNIFORM returns a scaled pseudorandom I4.
- I4VEC_INDICATOR sets an I4VEC to the indicator vector.
- I4VEC_PRINT prints an I4VEC.
- INT_TO_BYTE converts an unsigned integer into 4 bytes.
- ISBN_TO_I4 converts an ISBN character into an I4.
- PERM_CHECK checks that a vector represents a permutation.
- PERM_UNIFORM selects a random permutation of N objects.
- PRINT_SIZES reports the size in bytes of various data types.
- R4_ABS returns the absolute value of an R4.
- R4_NINT returns the nearest integer to an R4.
- R4_TO_STRING converts an R4 to a C++ string.
- R8_TO_STRING converts an R8 to a C++ string.
- R8_UNIFORM_01 returns a unit pseudorandom R8.
- REVERSE_BYTES_FLOAT reverses the four bytes in a float.
- REVERSE_BYTES_INT reverses the four bytes in an int.
- S_ADJUSTL flushes a string left.
- S_BEGIN reports whether string 1 begins with string 2, ignoring case.
- S_BEHEAD_SUBSTRING "beheads" a string, removing a given substring.
- S_BLANK_DELETE removes blanks and left justifies the remainder.
- S_BLANKS_DELETE replaces consecutive blanks by one blank.
- S_CAP capitalizes all the characters in a string.
- S_CH_COUNT counts occurrences of a particular character in a string.
- S_CONTROL_BLANK replaces control characters with blanks.
- S_EQI reports whether two strings are equal, ignoring case.
- S_ESCAPE_TEX de-escapes TeX escape sequences.
- S_FIRST_CH points to the first occurrence of a character in a string.
- S_FIRST_NONBLANK points to the first nonblank character in a string.
- S_INC_C "increments" the characters in a string.
- S_INC_N increments the digits in a string.
- S_LAST_CH points to the last occurrence of a character in a string.
- S_LEN_TRIM returns the length of a string to the last nonblank.
- S_LOW lowercases a string.
- S_NEWLINE_TO_NULL replaces carriage returns or newlines by nulls.
- S_NONALPHA_DELETE removes nonalphabetic characters from a string.
- S_REPLACE_CH replaces all occurrences of one character by another.
- S_REVERSE reverses the characters in a string.
- S_S_SUBANAGRAM determines if S2 is a "subanagram" of S1.
- S_S_SUBANAGRAM_SORTED determines if S2 is a "subanagram" of S1.
- S_SORT_A sorts a string into ascending order.
- S_SUBSTRING returns a substring of a given string.
- S_TAB_BLANK replaces each TAB character by a space.
- S_TO_FORMAT reads a FORTRAN format from a string.
- S_TO_I4 reads an I4 from a string.
- S_TO_I4VEC reads an I4VEC from a string.
- S_TO_L reads an L from a string.
- S_TO_R4 reads an R4 from a string.
- S_TO_R4VEC reads an R4VEC from a string.
- S_TO_R8 reads an R8 from a string.
- S_TO_R8VEC reads an R8VEC from a string.
- S_TO_ROT13 "rotates" the alphabetical characters in a string by 13 positions.
- S_TRIM promotes the final null forward through trailing blanks.
- S_WORD_CAP capitalizes the first character of each word in a string.
- S_WORD_COUNT counts the number of "words" in a string.
- S_WORD_EXTRACT_FIRST extracts the first word from a string.
- SORT_HEAP_EXTERNAL externally sorts a list of items into ascending order.
- SWAP_BYTES_FLOAT swaps pairs of bytes in a float.
- SWAP_BYTES_INT swaps pairs of bytes in an int.
- TIMESTAMP prints the current YMDHMS date as a time stamp.
- WORD_NEXT_READ "reads" words from a string, one at a time.
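To give a feel for what a few of the one-line descriptions above involve, here is a minimal sketch of three such operations: ROT13, collapsing consecutive blanks, and counting words. It is written from the descriptions alone. The names rot13_ch_sketch, rot13_s_sketch, blanks_collapse_sketch and word_count_sketch are hypothetical, and the argument lists and return types are assumptions rather than the library's actual interfaces. The ROT13 code also assumes an ASCII-like character set in which the letters are contiguous.

```cpp
#include <cctype>
#include <string>

// Hypothetical: ROT13 equivalent of one character; non-letters pass through.
// Assumes 'a'..'z' and 'A'..'Z' are contiguous, as in ASCII.
char rot13_ch_sketch(char c) {
    if (c >= 'a' && c <= 'z') {
        return static_cast<char>('a' + (c - 'a' + 13) % 26);
    }
    if (c >= 'A' && c <= 'Z') {
        return static_cast<char>('A' + (c - 'A' + 13) % 26);
    }
    return c;
}

// Hypothetical: rotate every alphabetic character in a string by 13 positions.
std::string rot13_s_sketch(std::string s) {
    for (char &c : s) {
        c = rot13_ch_sketch(c);
    }
    return s;
}

// Hypothetical: replace each run of consecutive blanks by a single blank.
std::string blanks_collapse_sketch(const std::string &s) {
    std::string out;
    for (char c : s) {
        // Skip a blank that directly follows a blank already copied.
        if (c == ' ' && !out.empty() && out.back() == ' ') {
            continue;
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical: count "words", taken here to be maximal runs of
// non-whitespace characters.
int word_count_sketch(const std::string &s) {
    int count = 0;
    bool in_word = false;
    for (char c : s) {
        bool is_space = std::isspace(static_cast<unsigned char>(c)) != 0;
        if (!is_space && !in_word) {
            ++count;  // a new word starts here
        }
        in_word = !is_space;
    }
    return count;
}
```

Applying rot13_s_sketch twice returns the original string, since 13 + 13 = 26 positions is a full turn of the alphabet.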
Last revised on 27 April 2011.