Code to parse string date
Parsing newline-delimited data records in bash is simple if you have this odd redirect up your sleeve. One annoyance in bash is that word splitting treats all whitespace characters alike by default, so the first block in the snippet won't let you process a file line by line; instead it ends up echoing each whitespace-delimited token on a separate line.
Bash provides the "read" builtin, which consumes input one line at a time and can therefore be used to differentiate between newlines and spaces.
The modus operandi for this is similar to that taken by PHP's implementation of such functions. It's comparatively memory-intensive, but is much faster than running a whole bunch of tests on every character.
Basically, you set up a mask -- an array of 256 null bytes -- and set the entries that correspond to the characters you wish to trim. Then, rather than having to test whether a character is in the set of characters to trim (O(n), or linear time in the length of *ws), you just test once (O(1), or constant time) to see if the mask entry for the byte in question is set.
And of course, to trim(), you just wrap trim() around both ltrim() and rtrim().
One point of caution: these functions trim in place, so copy strings before trimming them. (Of course, if you usually want access to both pre- and post-trimmed strings, you could always make these malloc() a new string and return a pointer to it . . . )
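The mask approach described above can be sketched roughly as follows. This is a minimal illustration, not the original snippet: the signatures (a string s plus a set of characters ws) and the build_mask() helper are assumptions made for the example, and only the names ltrim(), rtrim(), trim() and *ws come from the description.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill a 256-entry table so that mask[c] is 1
 * when byte c appears in ws (the set of characters to trim). */
static void build_mask(unsigned char mask[256], const char *ws)
{
    memset(mask, 0, 256);
    for (; *ws != '\0'; ws++)
        mask[(unsigned char)*ws] = 1;
}

/* Strip leading bytes that appear in ws. Works in place; returns s. */
char *ltrim(char *s, const char *ws)
{
    unsigned char mask[256];
    size_t skip = 0;

    build_mask(mask, ws);
    /* One table lookup per byte instead of a scan through ws. */
    while (s[skip] != '\0' && mask[(unsigned char)s[skip]])
        skip++;
    if (skip > 0)
        memmove(s, s + skip, strlen(s + skip) + 1); /* +1 copies the NUL */
    return s;
}

/* Strip trailing bytes that appear in ws. Works in place; returns s. */
char *rtrim(char *s, const char *ws)
{
    unsigned char mask[256];
    size_t len = strlen(s);

    build_mask(mask, ws);
    while (len > 0 && mask[(unsigned char)s[len - 1]])
        len--;
    s[len] = '\0';
    return s;
}

/* Strip both ends by composing the two functions above. */
char *trim(char *s, const char *ws)
{
    return ltrim(rtrim(s, ws), ws);
}
```

Under these assumptions, calling trim(buf, " \t\n") on a writable buffer holding "  hi  " leaves buf holding "hi". The buffer must be writable, so pass an array or heap copy rather than a string literal.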
wizard04
(Supported by JavaScript, maybe other languages)
wizard04
Functions for validating, parsing, and normalizing URIs and their parts.
If you find any errors, please leave a comment.
parseURI(str) splits a URI into its parts
parseQueryNumeric(str) splits a query string into its name/value pairs; returns a 2-D array
parseQueryAssociative(str) splits a query string into its name/value pairs; returns an associative array
parseURL(str) splits a URL (i.e., http(s) scheme URI) into its parts
normalizeURLDomain(domain) converts an obscured URL domain to a more readable one
normalizeIPv4(ip) normalizes an IPv4 address
normalizeIPv6(ip) normalizes an IPv6 address
normalizeURLPath(path) converts an obscured URL path to a more readable one
parseMailto(str) splits a mailto scheme URI into its parts
normalizeEmailAddress(str) converts an obscured email address to a more readable one; unfolds and removes comments
fixURL(str, domain) attempts to fix a URL if needed
fixHyperlink(str, domain, allowMailto) attempts to fix a hyperlink address (http(s) or mailto) if needed
For URLs, note that IPvFuture addresses are not supported.







