Path::is_dir follows symlinks so it returns true for symlinks
to directories. Use symlink_metadata instead so you can remove
symlinks to directories without the -r flag.
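A minimal sketch of the difference, assuming a Unix-like platform. The helper name `is_real_dir` is hypothetical and not taken from the actual change:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true only for a real directory. A symlink that points at a
/// directory yields false, because symlink_metadata does not follow links.
fn is_real_dir(path: &Path) -> io::Result<bool> {
    let meta = fs::symlink_metadata(path)?;
    Ok(meta.file_type().is_dir())
}
```

With this check, a symlink to a directory can be treated like a plain file and unlinked, while `Path::is_dir` would still report it as a directory.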
Currently, mkdir always succeeds for existing files and it
even modifies their mode. With this change, only mkdir -p for
existing directories will be allowed.
Make at_line_start persist between printing each file. This fixes an
issue when numbering lines in the output and one of the input files
does not have a trailing newline.
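The idea can be sketched as follows. This is an illustration only: `NumberingState` and `write_numbered` are hypothetical names, and the `{:>6}` plus tab prefix is assumed to match the usual `cat -n` layout.

```rust
use std::io::{self, Write};

/// Numbering state that lives across all input files.
struct NumberingState {
    line_number: usize,
    at_line_start: bool,
}

impl NumberingState {
    fn new() -> Self {
        NumberingState { line_number: 1, at_line_start: true }
    }
}

/// Copies `input` to `out`, emitting a line number whenever a new line starts.
/// Because `at_line_start` is kept in `state`, a file that ends without a
/// newline does not cause a number to be printed in the middle of a line.
fn write_numbered<W: Write>(
    input: &[u8],
    state: &mut NumberingState,
    out: &mut W,
) -> io::Result<()> {
    for &byte in input {
        if state.at_line_start {
            write!(out, "{:>6}\t", state.line_number)?;
            state.line_number += 1;
            state.at_line_start = false;
        }
        out.write_all(&[byte])?;
        if byte == b'\n' {
            state.at_line_start = true;
        }
    }
    Ok(())
}
```

Calling `write_numbered` once per file with the same `state` gives the second file's first bytes the chance to continue the unfinished line of the first file.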
- adds conditional support for Unix domain sockets
- adds unix domain socket test
- adds Results to functions, removing unwraps
- uutils `cat` used to panic on broken stdout pipes (e.g. `cat
/dev/zero | head -c1`). This is fixed in this PR.
- updated to exit 0 on success, and 1 if an error occurs.
- adds docstrings
- adds an error log on printing a directory
- adds categorization of other filetypes for extensible
differentiation of behaviors
- adds OutputOptions struct to replace params for extensibility
- adds correct status code on exit
Fixes #1017.
test_mkdir_dup_dir asserted that creating an existing directory is an
error, but that's not how GNU coreutils behaves. This has been reported
in #121, but wasn't fixed (only the `-p` case was).
Fuchsia uses musl as its libc; musl only has a stub implementation
for utmpx. According to their wiki, this is a deliberate choice.
Fuchsia doesn't have a signal mechanism.
* update status in README.md
* enable busybox tests
Adding `CONFIG_DESKTOP` and `CONFIG_LONG_OPTS` to busybox config.
These flags also enable other tests, but those utilities are not
included in `TEST_PROGS`. (eg. awk)
* fix whitespace and small issues
* fix Eq impl for FormatWriter on nightly + beta
* fix indentation in multifilereader.rs
* fix intermittent errors in tests
Rewrote cat to eliminate code duplication and make it safe
- UnsafeWriter is replaced by BufWriter
- write_lines (any option except -T and -v) and write_bytes (-T and -v
options) are replaced by a single write_lines method. The new method uses
``write_to_end``, ``write_tab_to_end`` or ``write_nonprint_to_end``
to write all symbols until the end of the line in the right way.
- Benchmarking:
| option | old (ns/iter) | new (ns/iter) |
| ------ | -------------------------- | -------------------------- |
| -n | 6,501,496 (+/- 1,173,481) | 6,683,158 (+/- 373,539) |
| -T | 8,634,023 (+/- 547,595) | 5,408,676 (+/- 715,458) |
| -v | 24,056,507 (+/- 1,177,445) | 30,879,788 (+/- 1,180,598) |
main panics when following /dev/stdin since /dev/stdin is not seekable.
Check to see if file is seekable and use unbounded_seek if so.
Also `tail -f` with no files should not follow stdin.
This allows us to check files without bringing them entirely into
memory. Also makes it easier to find the disorder in
(seq 9; echo 0) | sort --check
(points at the end of the file, where our previous version would
point at the start of the file)
Itertools' .coalesce() was the most useful helper that I could find
for comparing adjacent values in an iterator. It is designed for
implementing things like .dedup(), so the resulting code is a little
unintuitive.
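For illustration, the streaming check can also be written with a plain loop that remembers only the previous line. This sketch deliberately does not use Itertools' `.coalesce()`, and `first_disorder` is a hypothetical name:

```rust
/// Returns the 1-based line number and text of the first line that sorts
/// before its predecessor, or None if the input is already sorted.
/// Only the previous line is kept in memory.
fn first_disorder<I>(lines: I) -> Option<(usize, String)>
where
    I: IntoIterator<Item = String>,
{
    let mut prev: Option<String> = None;
    for (idx, line) in lines.into_iter().enumerate() {
        if let Some(p) = &prev {
            if line < *p {
                return Some((idx + 1, line));
            }
        }
        prev = Some(line);
    }
    None
}
```

For the `(seq 9; echo 0)` input this reports the final line `0`, which is the behavior described above.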
FileMerger receives Lines Iterables of the pre-sorted input files
via push_file(). It implements Iterator, which yields lines from the
input files in (merged) sorted order. If the input files are not sorted,
then the behavior is undefined.
Internally, FileMerger uses a
std::collections::BinaryHeap<MergeableFile>.
MergeableFile is an internal helper that implements Ord in a way that
BinaryHeap can use (note that we want smallest-first, but BinaryHeap
returns largest first, so MergeableFile::cmp() calls reverse() on
whatever compare_by() returns).
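A simplified sketch of this design follows. It assumes plain `String` comparison in place of the configurable compare_by(), and the field names are illustrative rather than taken from the real code:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// One input file: its current (smallest unread) line plus the remaining lines.
struct MergeableFile<I: Iterator<Item = String>> {
    current: String,
    rest: I,
}

impl<I: Iterator<Item = String>> PartialEq for MergeableFile<I> {
    fn eq(&self, other: &Self) -> bool {
        self.current == other.current
    }
}
impl<I: Iterator<Item = String>> Eq for MergeableFile<I> {}
impl<I: Iterator<Item = String>> PartialOrd for MergeableFile<I> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<I: Iterator<Item = String>> Ord for MergeableFile<I> {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap, so reverse to pop the smallest line first.
        self.current.cmp(&other.current).reverse()
    }
}

struct FileMerger<I: Iterator<Item = String>> {
    heap: BinaryHeap<MergeableFile<I>>,
}

impl<I: Iterator<Item = String>> FileMerger<I> {
    fn new() -> Self {
        FileMerger { heap: BinaryHeap::new() }
    }

    /// Registers a pre-sorted input; empty inputs are simply ignored.
    fn push_file(&mut self, mut lines: I) {
        if let Some(first) = lines.next() {
            self.heap.push(MergeableFile { current: first, rest: lines });
        }
    }
}

impl<I: Iterator<Item = String>> Iterator for FileMerger<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let mut file = self.heap.pop()?;
        match file.rest.next() {
            Some(line) => {
                // Hand out the current line and re-queue the file with its next one.
                let out = std::mem::replace(&mut file.current, line);
                self.heap.push(file);
                Some(out)
            }
            None => Some(file.current),
        }
    }
}
```

Each call to `next()` costs one pop and at most one push, so merging k files of n total lines takes O(n log k) comparisons.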
Made a new function sort_by(lines, compare_fns), which accepts a
list of compare_fns and calls lines.sort_by() with a closure that
calls each compare_fn in turn until one returns something other
than equal.
Default behavior ensures that String::cmp is the last element in the
compare_fns list (referred to as 'last resort' sorting by man sort).
Passing --stable (-s) turns this behaviour off.
Test cases provided for `sort --month` and `sort --month --stable`.
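The chaining of comparison functions might look roughly like this. The `CompareFn` alias and the two example comparators are assumptions made for the sketch; `by_length` merely stands in for a key such as `--month`:

```rust
use std::cmp::Ordering;

type CompareFn = fn(&String, &String) -> Ordering;

/// Sorts `lines` by trying each comparison function in turn until one of
/// them returns something other than Equal.
fn sort_by(lines: &mut Vec<String>, compare_fns: &[CompareFn]) {
    lines.sort_by(|a, b| {
        for compare_fn in compare_fns {
            let ordering = compare_fn(a, b);
            if ordering != Ordering::Equal {
                return ordering;
            }
        }
        Ordering::Equal
    });
}

/// Example primary key: shorter lines first.
fn by_length(a: &String, b: &String) -> Ordering {
    a.len().cmp(&b.len())
}

/// 'Last resort' comparison: plain String ordering.
fn default_compare(a: &String, b: &String) -> Ordering {
    a.cmp(b)
}
```

Leaving `default_compare` out of the list models `--stable`: lines that compare equal on the primary key keep their input order, since `Vec::sort_by` is a stable sort.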
* Add options -c, -F, -L, -l, -r, -R, -S, -t, -U, --color
* Fix options -a, -A
* Remove unused options
* Output in columns when not using -l
* Output date with -l
The all flag did not cull/remove the directory entries starting with a
dot. The help message indicates it should. The implementation checks
if the string starts with a dot whilst also using '-a' to determine
whether a DirEntry is to be printed.
I forgot that -v refers to "verbose" and not "version"
when making earlier changes. So I fixed that and for
good measure added the verbose flag anyway.
* Added flag -t/--target-directory
* No longer assumes that the source arguments are files in the CWD (in other words, can copy files from directories other than CWD)
We now accept symbolic and numeric mode strings using the
--mode or -m option for install. This is used either when
moving files into a directory, or when creating component
directories with the -d option. This feature was designed
to mirror the GNU implementation, including the possibly
quirky behaviour of `install --mode=u+wx file dir`
resulting in dir/file having exactly permissions 0300.
Extensive integration tests are included.
This change required a higher libc dependency.
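To see where 0300 comes from, here is a reduced sketch of mode-string handling. It is not the parser used by the utility: `apply_mode` is a hypothetical helper that supports only `ugoa`, `+-=` and `rwx`, and it ignores the umask.

```rust
/// Applies a numeric (octal) or simple symbolic mode string to `initial`.
fn apply_mode(mode_str: &str, initial: u32) -> Result<u32, String> {
    // Numeric form: the string is the mode itself.
    if !mode_str.is_empty() && mode_str.chars().all(|c| c.is_digit(8)) {
        return u32::from_str_radix(mode_str, 8).map_err(|e| e.to_string());
    }
    let mut mode = initial;
    for clause in mode_str.split(',') {
        let op_pos = clause
            .find(|c: char| c == '+' || c == '-' || c == '=')
            .ok_or_else(|| format!("missing operator in '{}'", clause))?;
        let (who, rest) = clause.split_at(op_pos);
        let op = rest.as_bytes()[0] as char;
        let perms = &rest[1..];

        // Which permission triplets the clause addresses.
        let mut who_mask = if who.is_empty() { 0o777 } else { 0 };
        for c in who.chars() {
            who_mask |= match c {
                'u' => 0o700,
                'g' => 0o070,
                'o' => 0o007,
                'a' => 0o777,
                _ => return Err(format!("invalid class '{}'", c)),
            };
        }
        // Which permission kinds are named, across all triplets.
        let mut perm_bits = 0;
        for c in perms.chars() {
            perm_bits |= match c {
                'r' => 0o444,
                'w' => 0o222,
                'x' => 0o111,
                _ => return Err(format!("invalid permission '{}'", c)),
            };
        }
        let bits = perm_bits & who_mask;
        mode = match op {
            '+' => mode | bits,
            '-' => mode & !bits,
            _ => (mode & !who_mask) | bits,
        };
    }
    Ok(mode)
}
```

Starting from an initial mode of 0, `u+wx` only ever sets the user write and execute bits, which is how `install --mode=u+wx` ends up with exactly 0300.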
We check if the user has given one of the (many)
not yet implemented command line arguments. Upon
catching this, we display the specific transgressor
to stderr and exit with return code 2.
This behaviour is tested in one new integration test.
Bare minimum functionality of `install file dir` implemented.
Also added TODO markers in code for outstanding parameters
and split main function into smaller logical chunks.
Add install utility skeleton source, based on
mv, including the getopts setup mirroring
GNU's `man install` documentation. Also
add a single test and build system code.
Before each line of content is printed, check if it's from a different
file than the last one we printed for. If so, print a '==> file <=='
header to separate the output in the way tail does.
If multiple files are passed as arguments with the -f option, a vector
of BufReaders is built as the files are first tailed, so that follow()
can take control for the rest of the time the program is running.
follow() loops over each reader and prints all new available content on
each file before moving on to the next.
To get the -f option to follow multiple files, bounded_tail should just
tail a single file and return, instead of blocking processing of other
files by calling follow() (which loops forever).
Makes `parse_size` return a `Result` where the `Err` part indicates whether
there was a parsing error, or the parsed size is too big to store. Also makes the
parsed value a `u64` rather than a `usize`.
Adds unit tests for `parse_size` and integration tests using the suffix
multiplier in a number passed with the `-n` flag.
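A sketch of what such a function could look like. The error variant names and the set of accepted suffixes are assumptions made for illustration, not the actual definitions:

```rust
#[derive(Debug, PartialEq)]
enum ParseSizeErr {
    /// The text is not a number followed by a known suffix.
    ParseFailure,
    /// The number (after applying the multiplier) does not fit in a u64.
    SizeTooBig,
}

fn parse_size(spec: &str) -> Result<u64, ParseSizeErr> {
    // Split into leading ASCII digits and the trailing suffix.
    let digits_end = spec
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(spec.len());
    let (digits, suffix) = spec.split_at(digits_end);
    if digits.is_empty() {
        return Err(ParseSizeErr::ParseFailure);
    }
    let multiplier: u64 = match suffix {
        "" => 1,
        "b" => 512,
        "kB" => 1000,
        "K" => 1024,
        "MB" => 1_000_000,
        "M" => 1 << 20,
        "GB" => 1_000_000_000,
        "G" => 1 << 30,
        _ => return Err(ParseSizeErr::ParseFailure),
    };
    // `digits` is non-empty and all ASCII digits, so the only way parsing
    // can fail here is overflow.
    let value: u64 = digits.parse().map_err(|_| ParseSizeErr::SizeTooBig)?;
    value.checked_mul(multiplier).ok_or(ParseSizeErr::SizeTooBig)
}
```

Separating the two error cases lets the caller print a different diagnostic for a malformed number than for one that is merely too large.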
The main issue is that -octal or -[rwx] is interpreted as an option by
getopts.
Search the args for such a pattern, remove it before parsing and
manually handle it afterwards.
Fixes #788.
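The pre-parsing step could be sketched like this. Both function names are hypothetical, and the pattern is simplified to "a dash followed by only octal digits, or only r/w/x":

```rust
/// True for arguments such as "-w", "-rwx" or "-755" that an option parser
/// would mistake for flags.
fn looks_like_negative_mode(arg: &str) -> bool {
    match arg.strip_prefix('-') {
        Some(rest) if !rest.is_empty() => {
            rest.chars().all(|c| c.is_digit(8))
                || rest.chars().all(|c| matches!(c, 'r' | 'w' | 'x'))
        }
        _ => false,
    }
}

/// Removes the first such argument from `args` and returns it, so that the
/// remaining arguments can be handed to the option parser unchanged.
fn extract_negative_mode(args: &mut Vec<String>) -> Option<String> {
    let pos = args.iter().position(|arg| looks_like_negative_mode(arg))?;
    Some(args.remove(pos))
}
```

The extracted string is then handled manually as the mode after the regular option parsing has finished.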
When tailing a file, as opposed to stdin, and we are tailing bytes rather than
lines, we can seek the requested number of bytes from the end of the file. This
sidesteps the whole `backwards_thru_file` file loop and blocks of reads.
Fixes #833.
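A minimal sketch of the byte case, written against any `Read + Seek` source. `tail_bytes` is a hypothetical helper that returns the bytes instead of printing them:

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Returns the last `n` bytes of `file` (or the whole file if it is shorter)
/// by seeking directly to the start position instead of scanning.
fn tail_bytes<R: Read + Seek>(file: &mut R, n: u64) -> io::Result<Vec<u8>> {
    let size = file.seek(SeekFrom::End(0))?;
    let start = size.saturating_sub(n);
    file.seek(SeekFrom::Start(start))?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(buf)
}
```

This only works for seekable inputs, which is why stdin still needs the read-everything path.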
When tail'ing a file, we do not need to read the whole file from start to finish
just to find the last n lines or bytes. Instead, we can seek to the end of the
file, and then read the file "backwards" in chunks until we find the location of
the first line/byte we wish to print. This ends up being a nice performance win
for very large files.
Fixes #764
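The backwards scan can be sketched as below. This is an illustration under assumptions: `start_of_last_lines` and its `block_size` parameter are hypothetical, and the function only locates the offset from which output should start.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Returns the byte offset at which the last `n` lines of `file` begin,
/// reading fixed-size blocks from the end towards the start.
fn start_of_last_lines<R: Read + Seek>(
    file: &mut R,
    n: usize,
    block_size: usize,
) -> io::Result<u64> {
    assert!(block_size > 0, "block_size must be positive");
    let size = file.seek(SeekFrom::End(0))?;
    if n == 0 {
        return Ok(size);
    }
    let mut buf = vec![0u8; block_size];
    let mut newlines_seen = 0usize;
    let mut block_end = size;
    while block_end > 0 {
        let block_start = block_end.saturating_sub(block_size as u64);
        let len = (block_end - block_start) as usize;
        file.seek(SeekFrom::Start(block_start))?;
        file.read_exact(&mut buf[..len])?;
        for i in (0..len).rev() {
            let pos = block_start + i as u64;
            // A newline that is the very last byte only terminates the final
            // line; it does not mark the start of another one.
            if buf[i] == b'\n' && pos + 1 != size {
                newlines_seen += 1;
                if newlines_seen == n {
                    return Ok(pos + 1);
                }
            }
        }
        block_end = block_start;
    }
    // Fewer than `n` lines in the file: start from the beginning.
    Ok(0)
}
```

After seeking to the returned offset, the rest of the file can be copied to the output in one forward pass.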
The `BufReader` argument passed to the `fn tail<T: Read>(&mut BufReader<T>,
settings: &settings)` function is never reused, so the `tail` function should
just take ownership of it.
calling install goal overrides utility build settings with utility install settings
calling install goal defaults profile to --release
PROG_PREFIX is now applied to all utilities
modify uutils.rs to make symbolic link bins possible
binary install paths are rm'd first to prevent errors from ln
simplify vars for more readable install target
other minor fixes
In order to work around lines() removing the newline byte and CRLF, I
switched from the iterator methods (lines/bytes) to the direct methods
(read_line/read). I also manually skipped lines/bytes.
Fixes #744.
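The difference can be illustrated with a small helper. `copy_lines` is a hypothetical name; the point is only that `read_line` keeps the line terminator while `lines()` strips it:

```rust
use std::io::{self, BufRead, Write};

/// Copies up to `count` lines from `reader` to `out`, preserving each line's
/// original terminator ("\n", "\r\n", or none on the last line).
fn copy_lines<R: BufRead, W: Write>(
    reader: &mut R,
    out: &mut W,
    count: usize,
) -> io::Result<()> {
    let mut line = String::new();
    for _ in 0..count {
        line.clear();
        // read_line returns 0 at end of input.
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        out.write_all(line.as_bytes())?;
    }
    Ok(())
}
```

Skipping lines works the same way: call `read_line` and discard the buffer instead of writing it.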
For coreutils, there are two build artifacts:
1. multicall executable (each utility is a separate static library)
2. individual utilities (still separate library with main wrapper)
To avoid namespace collision, each utility crate is defined as
"uu_{CMD}". The end user only sees the original utility name. This
simplifies build.rs.
Also, the thin wrapper for the main() function is no longer contained in
the crate. It has been separated into a dedicated file. This was
necessary to work around Cargo's need for the crate name attribute to
match the name in the respective Cargo.toml.
Since several utilities check if the standard streams are interactive, I
moved this into the uucore::fs library as is_std*_interactive(). I also
added Windows support for these methods, which currently always return false
(at least until someone finds a way to support this).
When determining the range from which to select portions of a line, the
upper limit of the range is a usize. The maximum upper value is
usize::MAX, but at one point this value is incremented, causing an
overflow. By setting the maximum upper value to usize::MAX-1, the bug is
averted. Since the upper limit of the range is an index (thus, ranging
from 0 to 2^64-1 for 64-bit platforms), the maximum usize should not be
reached.
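A small sketch of the clamping idea. The names `MAX_HIGH`, `parse_high` and `exclusive_end` are hypothetical and only illustrate why capping at `usize::MAX - 1` makes the later increment safe:

```rust
/// Largest upper bound a range may carry. It is one below usize::MAX so that
/// `high + 1` can never overflow.
const MAX_HIGH: usize = usize::MAX - 1;

/// Parses the upper end of a range such as the `5` in `2-5`. An empty string
/// stands for an open-ended range ("to end of line").
fn parse_high(text: &str) -> Option<usize> {
    if text.is_empty() {
        return Some(MAX_HIGH);
    }
    text.parse::<usize>().ok().map(|high| high.min(MAX_HIGH))
}

/// Converts an inclusive upper bound into an exclusive one.
/// Safe for every value produced by `parse_high`.
fn exclusive_end(high: usize) -> usize {
    high + 1
}
```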
I separated test's main() into a separate file to override Cargo's
requirement for matching crate names. I had to update the build command
to use a special extern reference for test.
Fixes issues caused by #728.
To avoid linking issues with Rust's libtest, the crate for the test
utility was changed to 'uutest'. However, the user doesn't need to see
this so a few hoops were jumped through to make this transparent.
I also updated the make rules to build the individual features first and
then uutils. This makes 'make && make test' look more organized.
Everything in src/common has been moved to src/uucore. This is defined
as a Cargo library, instead of directly included. This gives us
flexibility to make the library an external crate in the future.
Fixes #717.
Implemented as follows:
Usage: expr EXPRESSION
  or:  expr OPTION

      --help     display this help and exit
      --version  output version information and exit

Print the value of EXPRESSION to standard output. A blank line below
separates increasing precedence groups. EXPRESSION may be:

  ARG1 | ARG2       ARG1 if it is neither null nor 0, otherwise ARG2

  ARG1 & ARG2       ARG1 if neither argument is null or 0, otherwise 0

  ARG1 < ARG2       ARG1 is less than ARG2
  ARG1 <= ARG2      ARG1 is less than or equal to ARG2
  ARG1 = ARG2       ARG1 is equal to ARG2
  ARG1 != ARG2      ARG1 is unequal to ARG2
  ARG1 >= ARG2      ARG1 is greater than or equal to ARG2
  ARG1 > ARG2       ARG1 is greater than ARG2

  ARG1 + ARG2       arithmetic sum of ARG1 and ARG2
  ARG1 - ARG2       arithmetic difference of ARG1 and ARG2

  ARG1 * ARG2       arithmetic product of ARG1 and ARG2
  ARG1 / ARG2       arithmetic quotient of ARG1 divided by ARG2
  ARG1 % ARG2       arithmetic remainder of ARG1 divided by ARG2

  STRING : REGEXP   [NOT IMPLEMENTED] anchored pattern match of REGEXP in STRING

  match STRING REGEXP        [NOT IMPLEMENTED] same as STRING : REGEXP
  substr STRING POS LENGTH   [NOT IMPLEMENTED] substring of STRING, POS counted from 1
  index STRING CHARS         [NOT IMPLEMENTED] index in STRING where any CHARS is found, or 0
  length STRING              [NOT IMPLEMENTED] length of STRING
  + TOKEN                    interpret TOKEN as a string, even if it is a
                               keyword like 'match' or an operator like '/'

  ( EXPRESSION )             value of EXPRESSION

Beware that many operators need to be escaped or quoted for shells.
Comparisons are arithmetic if both ARGs are numbers, else lexicographical.
Pattern matches return the string matched between \( and \) or null; if
\( and \) are not used, they return the number of characters matched or 0.
Exit status is 0 if EXPRESSION is neither null nor 0, 1 if EXPRESSION is null
or 0, 2 if EXPRESSION is syntactically invalid, and 3 if an error occurred.
Environment variables:
* EXPR_DEBUG_TOKENS=1 dump expression's tokens
* EXPR_DEBUG_RPN=1 dump expression represented in Reverse Polish notation
* EXPR_DEBUG_SYA_STEP=1 dump each parser step
* EXPR_DEBUG_AST=1 dump expression represented as an abstract syntax tree
Builds the uutils multicall binary containing all utils (except stdbuf)
by default. To build only a subset,
`cargo build --no-default-features --features <utils>`
can be used.
What's missing is building the standalone binaries and a mechanism to
automatically disable the build of Unix-only utils on Windows.
I cleaned up string references, whitespace, and use of unstable
features. I also added a comment about reverting to connect, making
others aware that the method should be replaced by join after 1.3.
We are using connect() instead of join() until Rust 1.3 is stable.
Currently, connect() is just a thin wrapper over join(). Keeping the
deprecated method allows us to build on all releases.
The method, fs::canonicalize(), is unstable and can't be used for stable
builds. We already have our own implementation of canonicalize(), which
supports more options than the Rust library implementation.
Improve handling of unicode on Windows
Disable a few crates on Windows that abuse unix APIs too much
Signed-off-by: Peter Atashian <retep998@gmail.com>
I removed unused linker flags, added platform-specific linker flags, and
used DYLD_LIBRARY_PATH (instead of DYLD_INSERT_LIBRARIES) for loading
the dynamic library. I also removed an unused variable mutation.
There are several areas needing improvement:
1) add tests for hard links
2) add implementation for uncommon flags (-d, -L, -n, -P, -r)
3) align error messages more closely with GNU implementation
I switched over to the getopts crate on crates.io, instead of Rust's
private implementation. This will allow coreutils to build for Rust 1.0.
I'm splitting the updates into several commits for easier reviewing.
I switched over to the getopts crate on crates.io, instead of Rust's
private implementation. This will allow coreutils to build for Rust 1.0.
I'm splitting the updates into several commits for better reviewing.
This commit adds `cargo update` to the distclean target in the
makefile. This updates the Cargo.lock file when clearing the
deps directory.
In addition, it adds a faster implementation of the Sieve of
Eratosthenes for use by `src/factor/gen_table.rs` and `test/factor.rs`.
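For reference, a basic Sieve of Eratosthenes looks like the sketch below. It is the plain array-based variant, not necessarily the faster implementation this commit adds, and `primes_up_to` is a hypothetical name:

```rust
/// Returns all primes less than or equal to `limit`, in increasing order.
fn primes_up_to(limit: usize) -> Vec<usize> {
    if limit < 2 {
        return Vec::new();
    }
    let mut is_composite = vec![false; limit + 1];
    let mut primes = Vec::new();
    for n in 2..=limit {
        if !is_composite[n] {
            primes.push(n);
            // Smaller multiples of n were already crossed out by smaller primes.
            let mut multiple = n.saturating_mul(n);
            while multiple <= limit {
                is_composite[multiple] = true;
                multiple += n;
            }
        }
    }
    primes
}
```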
In addition to upgrading the nightly build, I flattened the Stat struct
to embed the metadata fields. This simplified access to the values, but
needed a constructor method for ergonomic reasons.
In addition to upgrading to the nightly build, I refactored the method
that creates the directories by switching from a recursive approach to
an iterative one. I also replaced the obsolete fs::mkdir() with a custom
method using fs::create_dir() and libc::chmod(). I added several
diagnostic messages that match the GNU implementation.
I updated to the nightly build, completed support for the verbose flag,
and refactored the canonicalization method to simplify and add support
for Windows paths.
This commit updates `cut` to build on rust nightly.
In addition, it adds support for null input and output delimiters,
and fixes a bug in the `cut_characters()` function that would cause
incorrect output when two adjacent fields were specified in the range
list.
Aside from the usual upgrades to sync with the nightly build, I fixed an
unwrap() panic when reading lines with only a newline. I also refactored
the repeated command calls to use helper functions.
I created random data to test several cases. I verified that the data is
split into the correct number of files and can also be reassembled into
the original file.
The GNU implementation first strips all trailing slashes before deleting
the directory portion. This case wasn't handled.
I also rewrote the method that strips the directory to use the PathBuf
methods for improved platform independence.
In addition, this commit brings the behavior of `rm` better in line
with the behavior of GNU Coreutils rm, especially as regarding recursive
interactive deletion of directories. This version asks to delete files
in a different order from GNU rm, but it now gives the option of stopping
the recursion at each new directory that is reached.
This change does the following:
1. Updates the arithmetic functions in `src/factor/numeric.rs` to
correctly handle all cases up to 2^64. When numbers are larger
than 2^63, we fall back to slightly slower routines that check
for and handle overflow.
2. Since the arithmetic functions will now not overflow, we no longer
need the safety net trial division implementation. We now always
use Pollard's rho after eliminating small (<=13 bit) primes.
3. Slight tweak in `src/factor/gen_table.rs` to generate the first
1027 primes, which means we test every prime of 13 or fewer bits
before going into Pollard's rho. Includes corresponding update in
`src/factor/prime_table.rs` and the Makefile to reflect this.
4. Add a new test that generates random numbers with exclusively
large (14 to 50 bit) prime factors. This exercises the possible
overflow paths.
5. Add another new test that checks the `is_prime()` function against
a few dozen 64-bit primes. Again this is to exercise possible
overflow paths.
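Point 1 can be illustrated with overflow-aware modular helpers. This is a sketch only: it takes the fast path whenever the product fits in a `u64`, which is a different criterion from the 2^63 threshold mentioned above, and the function names are assumptions.

```rust
/// (a + b) mod m for a, b < m, correct even when a + b exceeds u64::MAX.
fn add_mod(a: u64, b: u64, m: u64) -> u64 {
    let (sum, overflowed) = a.overflowing_add(b);
    if overflowed || sum >= m {
        // The true sum is below 2 * m, so one (wrapping) subtraction suffices.
        sum.wrapping_sub(m)
    } else {
        sum
    }
}

/// (a * b) mod m for any u64 inputs with m > 0.
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    let (mut a, mut b) = (a % m, b % m);
    // Fast path: the product fits in 64 bits.
    if let Some(product) = a.checked_mul(b) {
        return product % m;
    }
    // Slow path: double-and-add, using only overflow-safe additions.
    let mut result = 0;
    while b > 0 {
        if b & 1 == 1 {
            result = add_mod(result, a, m);
        }
        a = add_mod(a, a, m);
        b >>= 1;
    }
    result
}
```

The slow path costs up to 64 iterations per multiplication, which matches the description of the fallback routines as slightly slower.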
Add a test for `factor`.
This commit also pulls factor's Sieve implementation into its own module
so that the factor test can use it.
Finally, slight refactoring for clarity in gen_table.rs.
This commit builds upon @wikol's Pollard rho implementation.
It adds the following:
1. A generator for prime inverse tables. With these, we can do
very fast divisibility tests (a single multiply and comparison)
for small primes (presently, the first 1000 primes are in the
table, which means all numbers of ~26 bits or less can be
factored very quickly).
2. Always try prime inverse tables before jumping into Pollard's
rho method or using trial division.
3. Since we have eliminated all small factors by the time we're
done with the table division, only use slow trial division when
the number is big enough to cause overflow issues in Pollard's
rho, and jump out of trial division and into Pollard's rho as
soon as the number is small enough.
4. Updates the Makefile to regenerate the prime table if it's not
up-to-date.
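The "single multiply and comparison" test from point 1 can be sketched as follows. This uses the standard modular-inverse trick for odd divisors; the function names and the Newton iteration used to compute the inverse are assumptions, not the generator's actual code.

```rust
/// Computes the multiplicative inverse of an odd `p` modulo 2^64.
fn inverse_mod_2_64(p: u64) -> u64 {
    // For odd p, p * p ≡ 1 (mod 8), so `p` is its own inverse to 3 bits.
    // Each Newton step doubles the number of correct bits: 3, 6, 12, 24, 48, 96.
    let mut inv = p;
    for _ in 0..5 {
        inv = inv.wrapping_mul(2u64.wrapping_sub(p.wrapping_mul(inv)));
    }
    inv
}

/// True if `n` is divisible by the odd number `p`, where `inv` is
/// `inverse_mod_2_64(p)`. A table would store `inv` and `u64::MAX / p`
/// per prime, leaving one multiply and one comparison at run time.
fn is_divisible(n: u64, p: u64, inv: u64) -> bool {
    n.wrapping_mul(inv) <= u64::MAX / p
}
```

Multiplying by the inverse maps exact multiples of `p` onto the small range `0..=u64::MAX / p`, and every other value lands above it, which is why a single comparison is enough.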
The utility needed a substantial rewrite due to library changes and
lifetime issues. I needed to implement the MultiWriter struct since it
was no longer available.