This allows us to check files without bringing them entirely into
memory. Also makes it easier to find the disorder in
(seq 9; echo 0) | sort --check
(points at the end of the file, where our previous version would
point at the start of the file)
Itertools' .coalesce() was the most useful helper that I could find
for comparing adjacent values in an iterator. It is designed for
implementing things like .dedup(), so the resulting code is a little
unintuitive.
FileMerger receives Lines Iterables of the pre-sorted input files
via push_file() It implements Iterator, which yields lines from the
input files in (merged) sorted order. If the input files are not sorted,
then the behavior is undefined.
Internally, FileMerger uses a
std::collections::BinaryHeap<MergeableFile>.
MergeableFile is an internal helper that implements Ord in a way that
BinaryHeap can use (note that we want smallest-first, but BinaryHeap
returns largest first, so MergeableFile::cmp() calls reverse() on
whatever compare_by() returns.
Made a new function sort_by(lines, compare_fns), which accepts a
list of compare_fns and calls lines.sort_by() with a closure that
calls each compare_fn in turn until one returns something other
than equal.
Default behavior ensures that String::cmp is the last element in the
compare_fns list (referred to as 'last resort' sorting by man sort).
Passing --stable (-s) turns this behaviour off.
Test cases provided for `sort --month` and `sort --month --stable`.
* Add options -c, -F, -L, -l, -r, -R, -S, -t, -U, --color
* Fix options -a, -A
* Remove unused options
* Output in columns when not using -l
* Output date with -l
The all flag did not cull/remove the directory entries starting with a
dot. The help message indicates it should. The implementation checks
if the string starts with a dot whilst also using '-a' to determine
whether a DirEntry is to be printed.
I forgot that -v refers to "verbose" and not "version"
when making earlier changes. So I fixed that and for
good measure added the verbose flag anyway.
* Added flag -t/--target-directory
* No longer assumes that the source arguments are files in the CWD (in other words, can copy files from directories other than CWD)
We now accept symbolic and numeric mode strings using the
--mode or -m option for install. This is used either when
moving files into a directory, or when creating component
directories with the -d option. This feature was designed
to mirror the GNU implementation, including the possibly
quirky behaviour of `install --mode=u+wx file dir`
resulting in dir/file having exactly permissions 0300.
Extensive integration tests are included.
This chnage required a higher libc dependency.
We check if the user has given one of the (many)
not yet implemented command line arguments. Upon
catching this, we display the specific transgressor
to stderr and exit with return code 2.
This behaviour is tested in one new integration test.
Bare minimum functionality of `install file dir` implemented.
Also added TODO markers in code for outstanding parameters
and split main function into smaller logical chunks.
Add install utility skeleton source, based on
mv, including the getopts setup mirroring
GNU's `man install` documentation. Also
add a single test and build system code.
Before each line of content is printed, check if it's from a different
file than the last one we printed for. If so, print a '==> file <=='
header to separate the output in the way tail does.
If multiple files are passed as arguments with the -f option, a vector
of BufReaders is built as the files are first tailed, so that follow()
can take control for the rest of the time the program is running.
follow() loops over each reader and prints all new available content on
each file before moving on to the next.
To get the -f option to follow multiple files, bounded_tail should just
tail a single file and return, instead of blocking processing of other
files by calling follow() (which loops forever).
Makes `parse_size` return a `Result` where the `Err` part indicates whether
there was a parsing error, or the parse size is too big to store. Also makes the
value parsed a `u64` rather than a `usize`.
Adds unit tests for `parse_size` and integration tests using the suffix
multiplier in a number passed with the `-n` flag.
The main issue is that -octal or -[rwx] is interpreted as an option by
getopts.
Search the args for such a pattern, remove it before parsing and
manually handle it afterwards.
Fixes#788.
When tailing a file, as opposed to stdin, and we are tailing bytes rather than
lines, we can seek the requested number of bytes from the end of the file. This
side steps the whole `backwards_thru_file` file loop and blocks of reads.
Fixes#833.
When tail'ing a file, we do not need to read the whole file from start to finish
just to find the last n lines or bytes. Instead, we can seek to the end of the
file, and then read the file "backwards" in chunks until we find the location of
the first line/byte we wish to print. This ends up being a nice performance win
for very large files.
Fixes#764
The `BufReader` argument passed to the `fn tail<T: Read>(&mut BufReader<T>,
settings: &settings)` function is never reused, so the `tail` function should
just take ownership of it.
calling install goal overrides utility build settings with utility install settings
calling install goal defaults profile to --release
PROG_PREFIX is now applied to all utilities
modify uutils.rs to make symbolic link bins possible
binary install paths rmd first to prevent errors due to lns
simplify vars for more readable install target
other minor fixes
In order to work around lines() removing the newline byte and CRLF, I
switched from the iterator methods (lines/bytes) to the direct methods
(read_line/read). I also manually skipped lines/bytes.
Fixes#744.
For coreutils, there are two build artifacts:
1. multicall executable (each utility is a separate static library)
2. individual utilities (still separate library with main wrapper)
To avoid namespace collision, each utility crate is defined as
"uu_{CMD}". The end user only sees the original utility name. This
simplifies build.rs.
Also, the thin wrapper for the main() function is no longer contained in
the crate. It has been separated into a dedicated file. This was
necessary to work around Cargo's need for the crate name attribute to
match the name in the respective Cargo.toml.
Since several utilities check if the standard streams are interactive, I
moved this into the uucore::fs library as is_std*_interactive(). I also
added Windows support for these methods, which only return false (or at
least until someone finds a way to support this).
When determining the range from which to select portions of a line, the
upper limit of the range is a usize. The maximum upper value is
usize::MAX, but at one point this value is incremented, causing an
overflow. By setting the maximum upper value to usize::MAX-1, the bug is
averted. Since the upper limit of the range is an index (thus, ranging
from 0 to 2^64-1 for 64-bit platforms), the maximum usize should not be
reached.
I separated test's main() into a separate file to override Cargo's
requirement for matching crate names. I had to update the build command
to use a special extern reference for test.
Fixes issues caused by #728.
To avoid linking issues with Rust's libtest, the crate for the test
utility was changed to 'uutest'. However, the user doesn't need to see
this so a few hoops were jumped through to make this transparent.
I also updated the make rules to build the individual features first and
then uutils. This makes 'make && make test' look more organized.
Everything in src/common has been moved to src/uucore. This is defined
as a Cargo library, instead of directly included. This gives us
flexibility to make the library an external crate in the future.
Fixes#717.
Implemented as follows:
Usage: expr EXPRESSION
or: expr OPTION
--help display this help and exit
--version output version information and exit
Print the value of EXPRESSION to standard output. A blank line below
separates increasing precedence groups. EXPRESSION may be:
ARG1 | ARG2 ARG1 if it is neither null nor 0, otherwise ARG2
ARG1 & ARG2 ARG1 if neither argument is null or 0, otherwise 0
ARG1 < ARG2 ARG1 is less than ARG2
ARG1 <= ARG2 ARG1 is less than or equal to ARG2
ARG1 = ARG2 ARG1 is equal to ARG2
ARG1 != ARG2 ARG1 is unequal to ARG2
ARG1 >= ARG2 ARG1 is greater than or equal to ARG2
ARG1 > ARG2 ARG1 is greater than ARG2
ARG1 + ARG2 arithmetic sum of ARG1 and ARG2
ARG1 - ARG2 arithmetic difference of ARG1 and ARG2
ARG1 * ARG2 arithmetic product of ARG1 and ARG2
ARG1 / ARG2 arithmetic quotient of ARG1 divided by ARG2
ARG1 % ARG2 arithmetic remainder of ARG1 divided by ARG2
STRING : REGEXP [NOT IMPLEMENTED] anchored pattern match of REGEXP in STRING
match STRING REGEXP [NOT IMPLEMENTED] same as STRING : REGEXP
substr STRING POS LENGTH [NOT IMPLEMENTED] substring of STRING, POS counted from 1
index STRING CHARS [NOT IMPLEMENTED] index in STRING where any CHARS is found, or 0
length STRING [NOT IMPLEMENTED] length of STRING
+ TOKEN interpret TOKEN as a string, even if it is a
keyword like 'match' or an operator like '/'
( EXPRESSION ) value of EXPRESSION
Beware that many operators need to be escaped or quoted for shells.
Comparisons are arithmetic if both ARGs are numbers, else lexicographical.
Pattern matches return the string matched between \( and \) or null; if
\( and \) are not used, they return the number of characters matched or 0.
Exit status is 0 if EXPRESSION is neither null nor 0, 1 if EXPRESSION is null
or 0, 2 if EXPRESSION is syntactically invalid, and 3 if an error occurred.
Environment variables:
* EXPR_DEBUG_TOKENS=1 dump expression's tokens
* EXPR_DEBUG_RPN=1 dump expression represented in reverse polish notation
* EXPR_DEBUG_SYA_STEP=1 dump each parser step
* EXPR_DEBUG_AST=1 dump expression represented abstract syntax tree
Builds the uutils multicall binary containing all utils (except stdbuf)
by default. To only build a subset
`cargo --no-default-features --features <utils>`
can be used.
Whats missing is building the standalone binaries and a mechanism to
automatically disable the build of unix only utils on windows.
I cleaned up string references, whitespace, and use of unstable
features. I also added a comment about reverting to connect, making
others aware that the method should be replaced by join after 1.3.
We are using connect() instead of join() until Rust 1.3 is stable.
Currently, connect() is just a thin wrapper over join(). Keeping the
deprecated method allows us to build on all releases.
The method, fs::canonicalize(), is unstable and can't be used for stable
builds. We already have our own implementation of canonicalize(), which
supports more options than the Rust library implementation.
Improve handling of unicode on Windows
Disable a few crates on Windows that abuse unix APIs too much
Signed-off-by: Peter Atashian <retep998@gmail.com>
I removed unused linker flags, added platform-specific linker flags, and
used DYLD_LIBRARY_PATH (instead of DYLD_INSERT_LIBRARIES) for loading
the dynamic library. I also removed an unused variable mutation.
There are several areas needing improvement:
1) add tests for hard links
2) add implementation for uncommon flags (-d, -L, -n, -P, -r)
3) align error messages more closely with GNU implementation
I switched over to the getopts crate on crates.io, instead of Rust's
private implementation. This will allow coreutils to build for Rust 1.0.
I'm splitting the updates into several commits for easier reviewing.
I switched over to the getopts crate on crates.io, instead of Rust's
private implementation. This will allow coreutils to build for Rust 1.0.
I'm splitting the updates into several commits for better reviewing.
I switched over to the getopts crate on crates.io, instead of Rust's
private implementation. This will allow coreutils to build for Rust 1.0.
I'm splitting the updates into several commits for better reviewing.