module CharStream: Character streams.
The CharStream module provides a position-based interface to character streams. The streams are optimized for applications that mostly read a stream sequentially and occasionally backtrack over a bounded distance, which is a common usage pattern of backtracking parsers.
The characters in a character stream provided by this module are accessed based on their position in the stream. A position pos is valid in the stream s if it satisfies 0 <= pos < length s. Character streams can be created from input channels and from strings.
val from_string :
string -> t
from_string s creates a character stream that contains the characters of the string s. The string is not copied, hence subsequent modifications to it are visible from the stream.
val from_channel :
?block_size:int -> ?block_overlap:int -> ?min_rspace:int -> Pervasives.in_channel -> t
from_channel ?block_size ?block_overlap ?min_rspace chn creates a character stream that contains the characters of the input channel chn. The behavior of the stream is undefined if the channel is modified subsequently.
When a character stream is created from an input channel, the characters in this channel are read in overlapping blocks, where block_size and block_overlap determine the size of a block and the amount of overlap. If the length of the channel is not greater than the block size, the whole channel is read at once. Otherwise, only a single block of the channel is kept in memory at a time. Whenever the current stream position leaves the part that is currently kept in memory, a new block is read from the channel. The channel must support seeking (i.e., must be created from a regular file) to enable this. If possible, blocks are read with the specified amount of overlap to minimize the re-reading of blocks due to backtracking.
min_rspace specifies the minimum number of characters a regular expression is matched against (if possible) by match_regexp.
Raises Invalid_argument if the arguments are invalid.
block_size : default 1048576 characters, valid range: 1 <= block_size <= Sys.max_string_length.
block_overlap : default block_size / 16, valid range: 1 <= block_overlap <= block_size / 2.
min_rspace : default block_size / 64, valid range: 1 <= min_rspace <= block_overlap.
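The documented ranges for the three optional parameters can be written as a small check. This is only a sketch: check_params is a hypothetical helper and not part of CharStream, and it assumes that an omitted block_overlap or min_rspace is derived from the block_size actually in effect.

```ocaml
(* Hypothetical helper, not part of CharStream: applies the documented
   defaults and checks the documented valid ranges. *)
let check_params ?(block_size = 1048576) ?block_overlap ?min_rspace () =
  (* Assumption: omitted values are derived from the effective block_size. *)
  let block_overlap =
    match block_overlap with Some n -> n | None -> block_size / 16 in
  let min_rspace =
    match min_rspace with Some n -> n | None -> block_size / 64 in
  if block_size < 1 || block_size > Sys.max_string_length then
    invalid_arg "check_params: block_size";
  if block_overlap < 1 || block_overlap > block_size / 2 then
    invalid_arg "check_params: block_overlap";
  if min_rspace < 1 || min_rspace > block_overlap then
    invalid_arg "check_params: min_rspace";
  (block_size, block_overlap, min_rspace)
```

Note that the ranges are nested: min_rspace is bounded by block_overlap, which in turn is bounded by half of block_size.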
val length :
t -> int
length s returns the number of characters in the stream s.
val seek :
t -> int -> unit
seek s pos prepares the stream for reading from position pos. If pos is outside the block currently held in memory, a block containing pos is read, replacing the old block.
Raises Invalid_argument if pos is not a valid stream position.
val read_char :
t -> int -> char option
read_char s pos returns Some c if c is the character at position pos in s, or None if this position is not a valid position in s.
val read_string :
t -> int -> int -> string
read_string s pos maxlen returns a string containing the next n characters in the stream s, where n is the minimum of maxlen and the number of characters remaining from position pos. If pos is not a valid position in s, the empty string is returned.
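The read semantics described above can be modelled on a plain string. The following is a minimal sketch, not the CharStream implementation: is_valid, read_char_model and read_string_model are hypothetical names, the stream is replaced by an ordinary string, and maxlen is assumed to be non-negative.

```ocaml
(* Hypothetical model on plain strings; not the CharStream implementation. *)

(* A position is valid if 0 <= pos < length s. *)
let is_valid s pos = 0 <= pos && pos < String.length s

(* Some c at a valid position, None otherwise. *)
let read_char_model s pos =
  if is_valid s pos then Some s.[pos] else None

(* At most maxlen characters starting at pos; "" at an invalid position.
   Assumes maxlen >= 0 (String.sub raises Invalid_argument otherwise). *)
let read_string_model s pos maxlen =
  if not (is_valid s pos) then ""
  else
    let n = min maxlen (String.length s - pos) in
    String.sub s pos n
```

In this model, reading near the end of the input yields a shorter string rather than an error, which is what lets a parser probe ahead without checking the remaining length first.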
val match_char :
t -> int -> char -> bool
match_char s pos c is equivalent to read_char s pos = Some c.
val match_string :
t -> int -> string -> bool
match_string s pos str is equivalent to read_string s pos (String.length str) = str.
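The two equivalences can be spelled out directly. In this sketch the functions read_char and read_string are hypothetical string-based stand-ins for the stream operations (the real ones take a stream of type t), so it only illustrates the stated equivalences and says nothing about how the module implements matching.

```ocaml
(* Hypothetical string-based stand-ins for read_char and read_string. *)
let read_char s pos =
  if 0 <= pos && pos < String.length s then Some s.[pos] else None

let read_string s pos maxlen =
  if 0 <= pos && pos < String.length s
  then String.sub s pos (min maxlen (String.length s - pos))
  else ""

(* The two matchers, written exactly as the documented equivalences. *)
let match_char s pos c = read_char s pos = Some c
let match_string s pos str = read_string s pos (String.length str) = str
```

One consequence of the equivalence, at least in this model: matching the empty string succeeds at every position, including invalid ones, because read_string returns the empty string there.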
val match_regexp :
t -> int -> regexp -> substrings option
match_regexp s pos rex matches the regular expression rex against the characters in s starting at position pos. It returns Some substrings if the match succeeds, where substrings contains the matched substrings. If the match fails or if pos is not a valid position in s, it returns None.
It is not guaranteed that rex is matched against the complete substream starting at the current position. The min_rspace parameter of CharStream.from_channel specifies the minimum number of characters available for matching.