Commit graph

3170 commits

Author SHA1 Message Date
kwantam
8be67f7d4d fmt Knuth-Plass implementation; unicode char_width
fmt:
- Implemented Knuth-Plass optimal linebreaking strategy.
- Added commandline switch -q for "quick" (greedy) split
  mode that does not use Knuth-Plass.
- Right now, Knuth-Plass runs about half as fast as the quick
  (greedy) mode. It also uses more memory.
- Updated fmt to use char_width (see below) instead of
  assuming each character width is 1.
- Use i64 for demerits instead of int in K-P, since int is
  pointer sized and will only be 32 bits on some
  architectures.
- incremented version number
- Incorporated improvements suggested by huonw and Arcterus.
  - K-P uses indices of linebreaks vector instead of raw
    pointers. This gets rid of a lot of allocation of boxes
    and improves safety to boot.
- Added a support module for computing displayed widths of unicode
  strings based on Markus Kuhn's free implementation at
    http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
- This is in `charwidth.rs`, but this is a temporary measure
  until the Char trait implements .width(). I am submitting
  a PR for this soon, and the code in charwidth() is what's
  generated for libcore.

closes #223
2014-06-30 19:09:22 -04:00
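To illustrate the difference between the "quick" (greedy) mode and an optimizing strategy, here is a minimal sketch. It is not the code from this commit and it is not full Knuth-Plass: it minimizes the sum of squared trailing slack with a simple dynamic program. The function names `break_greedy` and `break_optimal` are hypothetical, and word width is assumed to be the number of `char`s, whereas the commit uses a display-width function.

```rust
/// Greedy ("quick") breaking: put as many words on each line as fit.
/// Assumption: each char is one column wide.
fn break_greedy<'a>(words: &[&'a str], width: usize) -> Vec<Vec<&'a str>> {
    let mut lines: Vec<Vec<&str>> = Vec::new();
    let mut cur: Vec<&str> = Vec::new();
    let mut cur_len = 0;
    for &w in words {
        let wlen = w.chars().count();
        // Start a new line if this word (plus a space) would overflow.
        if !cur.is_empty() && cur_len + 1 + wlen > width {
            lines.push(std::mem::take(&mut cur));
            cur_len = 0;
        }
        if !cur.is_empty() {
            cur_len += 1; // separating space
        }
        cur_len += wlen;
        cur.push(w);
    }
    if !cur.is_empty() {
        lines.push(cur);
    }
    lines
}

/// Simplified optimal breaking: minimize the sum of squared slack over
/// all lines except the last. Demerits are i64, as in the commit message.
fn break_optimal<'a>(words: &[&'a str], width: usize) -> Vec<Vec<&'a str>> {
    let n = words.len();
    let lens: Vec<usize> = words.iter().map(|w| w.chars().count()).collect();
    // best[i]: minimal demerits for words[i..]; next[i]: start of the following line.
    let mut best = vec![i64::MAX; n + 1];
    let mut next = vec![n; n + 1];
    best[n] = 0;
    for i in (0..n).rev() {
        let mut line_len = 0usize;
        for j in i..n {
            line_len += lens[j] + if j > i { 1 } else { 0 };
            // Overfull lines are rejected, except a single over-long word.
            if line_len > width && j > i {
                break;
            }
            let slack = width.saturating_sub(line_len) as i64;
            let cost = if j + 1 == n { 0 } else { slack * slack };
            let total = cost + best[j + 1];
            if total < best[i] {
                best[i] = total;
                next[i] = j + 1;
            }
        }
    }
    // Follow the stored break indices to rebuild the lines.
    let mut lines = Vec::new();
    let mut i = 0;
    while i < n {
        lines.push(words[i..next[i]].to_vec());
        i = next[i];
    }
    lines
}
```

With the words `aaa bb cc ddddd` and a width of 6, the greedy version fills the first line and leaves `cc` alone on the second, while the optimizing version moves `bb` down to balance the lines. Like the commit, the sketch stores break positions as vector indices rather than pointers.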
Michael Gehring
d88f7a0dc5 seq: fix build 2014-06-30 13:26:30 +02:00
Oly Mi
8a7f96c605 Merge pull request #321 from ebfe/fix-build-master
Update rust-crypto
2014-06-30 15:06:14 +04:00
Michael Gehring
cf7da259ee Update rust-crypto 2014-06-30 12:51:30 +02:00
Arcterus
48727e4ee8 Merge pull request #319 from torkve/master
realpath and relpath implementation
2014-06-29 13:03:52 -07:00
Vsevolod Velichko
c6f75a1419 relpath implementation 2014-06-29 23:59:25 +04:00
Vsevolod Velichko
c7e93c009e realpath implementation 2014-06-29 23:57:54 +04:00
Oly Mi
b8e200489e Merge pull request #320 from Arcterus/nohup-fix
Couple of minor fixes
2014-06-29 06:49:59 +04:00
Arcterus
643d9f0f32 Remove a couple of warnings 2014-06-28 16:57:19 -07:00
Arcterus
ae4ad2bb04 Remove useless main functions and fix nohup on Macs 2014-06-28 16:45:10 -07:00
Arcterus
8fd455f8e5 Merge pull request #234 from polyphemus/cut
Implement cut - implement #165
2014-06-27 09:30:53 -07:00
polyphemus
a470c330e6 Add cut to Cargo.toml, remove cut from To Do list 2014-06-27 17:39:59 +02:00
polyphemus
798af52077 Implement fields cutting
Adds an implementation for cut_fields() and creates a separate function
for the --output-delimiter, for performance reasons.

This implementation relies on ::read_until() to find the newline for us,
but read_until() allocates a vector every time to return its result.
This is not ideal and should be improved upon by passing a buffer to
read().

This follows/implements the POSIX specification and all the GNU
conventions. It is a drop-in replacement for GNU cut.

One improvement over GNU is that the --delimiter option takes a UTF-8
character, as opposed to a single byte only for GNU.

Performance is about two times slower than that of GNU cut.

Remove the ranges' sentinel value. All cut functions iterate over the
ranges, so the sentinel only adds an extra iteration instead of improving
performance.
2014-06-27 17:39:49 +02:00
polyphemus
0e46d453b7 Rewrite cut_characters
This follows the cut_bytes() approach of letting read_line() create a
buffer and find the newline. read_line() guarantees our buffer is a
string of utf8 characters.

When writing out the bytes segment we need to make sure we are cutting
on utf8 boundaries; therefore we must iterate over the buffer
from read_line(). This implementation is (or should be) efficient, as it
only iterates once over the buffer.

The previous performance was about 4x as slow as cut_bytes() and now it
is about 2x as slow as cut_bytes().
2014-06-27 17:39:49 +02:00
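A minimal sketch of cutting on UTF-8 boundaries in a single pass, assuming one inclusive 1-based character range per call. The function name `cut_chars` is hypothetical and this is not the commit's code; it only shows why the buffer has to be walked character by character.

```rust
/// Return the characters at 1-based inclusive positions `low..=high` of
/// `line`, locating both byte offsets in one pass over the string.
fn cut_chars(line: &str, low: usize, high: usize) -> &str {
    if low == 0 || high < low {
        return "";
    }
    let mut start = None;
    let mut end = line.len(); // default: the range runs to the end of the line
    for (n, (idx, _)) in line.char_indices().enumerate() {
        if n + 1 == low {
            start = Some(idx); // byte offset of the first wanted character
        }
        if n == high {
            end = idx; // byte offset just past the last wanted character
            break;
        }
    }
    match start {
        Some(s) => &line[s..end],
        None => "", // the line has fewer than `low` characters
    }
}
```

Because `char_indices()` only yields offsets that sit on character boundaries, the slice can never split a multi-byte character.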
polyphemus
b1c2d7ac7c Rewrite cut_bytes()
No longer iterate over each byte; instead rely on the Buffer trait
to find the newline for us. Iterate over the ranges to specify slices of
the line which need to be printed out.

This rewrite gives a significant performance increase:
Old:    1.32s
mahkoh: 0.90s
New:    0.20s
GNU:    0.15s
2014-06-27 17:39:49 +02:00
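The range-driven approach described above can be sketched as follows. This is an illustration under assumptions, not the commit's code: the `Range` struct and `cut_bytes_line` are hypothetical names, ranges are taken to be inclusive and 1-based as in `cut -b 2-4`, and the line is assumed to have been read already, without its newline.

```rust
/// Inclusive, 1-based byte range (hypothetical type for this sketch).
struct Range {
    low: usize,
    high: usize,
}

/// Append the requested byte ranges of one line to `out`.
/// The loop is over the ranges, not over every byte of the line.
fn cut_bytes_line(line: &[u8], ranges: &[Range], out: &mut Vec<u8>) {
    for r in ranges {
        if r.low == 0 || r.low > line.len() {
            continue; // range is invalid or starts past the end of the line
        }
        let end = r.high.min(line.len()); // clamp open-ended ranges
        if end < r.low {
            continue;
        }
        out.extend_from_slice(&line[r.low - 1..end]);
    }
}
```

Calling it with the ranges 1-3 and 5-100 on the line `abcdefg` appends `abcefg` to the output buffer.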
polyphemus
8b1ff08bd5 Add cut_characters implementation, based on cut_bytes
This implementation uses Rust's concept of characters and fails if the
input isn't valid UTF-8. GNU cut implements '--characters' as an alias
for '--bytes', so this option has different semantics in GNU cut than in
this implementation.
2014-06-27 17:39:49 +02:00
polyphemus
2ab586459b Add initial cut support, only bytes cutting 2014-06-27 17:39:41 +02:00
Arcterus
a5187bed7c Merge pull request #317 from redcape/fix-mem-and-text-mode
Use Less Memory and Fix Text Mode on Windows/UNIX
2014-06-26 21:55:07 -07:00
Gil Cottle
9944bdedd4 Use Less Memory and Fix Text Mode on Windows/UNIX
The following are changes to fix #303:
  1. hashsum pulls 512KB chunks of the file into memory. This ends up taking 1MB with
     a secondary buffer allocated for Windows. hashsum is now able to hash files larger
     than the computer's available memory.
  2. Text no longer transforms to UTF-8. This allows hashing to work on binary files
     without specifying text mode. On Windows, it converts a Windows newline '\r\n' to
     the standard newline '\n'.
  3. Set default modes: Windows uses binary by default, all other systems use text.

Gil Cottle <gcottle@redtown.org>
2014-06-27 00:45:48 -04:00
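A minimal sketch of hashing a stream in fixed-size chunks, so that memory use does not grow with file size. It is not the hashsum code from this commit: the name `hash_stream` is hypothetical, and FNV-1a is used only as a standard-library-only stand-in for a real digest from rust-crypto. The 512KB chunk size is taken from the commit message.

```rust
use std::io::{self, Read};

/// Chunk size from the commit message: 512KB.
const CHUNK_SIZE: usize = 512 * 1024;

/// Hash everything readable from `reader`, holding one chunk in memory at a time.
/// FNV-1a (64-bit) stands in for a cryptographic digest here.
fn hash_stream<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Feed only the bytes actually read into the running hash state.
        for &b in &buf[..n] {
            hash ^= b as u64;
            hash = hash.wrapping_mul(0x100000001b3); // FNV prime
        }
    }
    Ok(hash)
}
```

Because the hash state is updated incrementally, the result does not depend on how the input happens to be split across reads.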
Arcterus
cbc21642ab Merge pull request #309 from ebfe/build
make: always build multicall binary
2014-06-26 18:40:33 -07:00
Michael Gehring
30bba07f9c always build multicall binary
squashed:
	a2c6b27 - build: automatically generate main() files
	c942f0f - remove MULTICALL=1 build from travis
	cb7b35b - make: remove unnecessary shell command
	69bbb31 - update README
	03a3168 - all: move main() into separate file that links against util crate
	8276384 - make: always build multicall binary
	aa4edeb - make: avoid 'rustc --crate-file-name'
2014-06-26 10:26:16 +02:00
Arcterus
8568d41a09 Merge pull request #304 from torkve/master
nohup implementation
2014-06-25 23:46:30 -07:00
Vsevolod Velichko
3d75a9ba9d Added nohup to cargo 2014-06-26 10:41:32 +04:00
Vsevolod Velichko
ff44e28a4d README: added notice about uutils to contributions guide 2014-06-26 10:41:32 +04:00
Vsevolod Velichko
0063bb2a8c Added nohup to uutils 2014-06-26 10:41:32 +04:00
Vsevolod Velichko
3da3d7333c nohup removed from README 2014-06-26 10:41:32 +04:00
Vsevolod Velichko
9fb33699b1 nohup implementation 2014-06-26 10:41:32 +04:00
Arcterus
aba12a39f0 Merge pull request #313 from Heather/master
move sync to PROGS
2014-06-25 23:18:08 -07:00
Heather
4aa009995b move sync to PROGS 2014-06-26 10:05:31 +04:00
Oly Mi
4ebe8e0da7 Merge pull request #285 from xanderfomin/sync
sync for Windows implementation
2014-06-26 09:56:03 +04:00
Oly Mi
308a764677 Merge pull request #312 from redcape/update-rust-crypto
use latest rust-crypto for new rust master
2014-06-26 09:50:21 +04:00
Gil Cottle
26f45fb1e6 use latest rust-crypto for new rust master 2014-06-26 01:22:39 -04:00
Oly Mi
b44979cb19 Merge pull request #310 from alan-andrade/patch-6
drop stty
2014-06-26 07:04:26 +04:00
Alan Andrade
c2dd4c8f2a drop stty
Get rid of the in progress for stty.
2014-06-25 16:12:49 -07:00
Oly Mi
58d0d930eb Merge pull request #308 from ebfe/fix-build-master
Fix build with rust master
2014-06-25 15:27:55 +04:00
Michael Gehring
765ea7b6eb std::bool::to_bit was removed 2014-06-25 13:12:56 +02:00
Michael Gehring
b3c9fd891e Add type suffixes where necessary 2014-06-25 13:12:43 +02:00
Arcterus
e023d97821 Merge pull request #307 from ebfe/cargo
add Cargo.toml
2014-06-25 00:25:37 -07:00
Michael Gehring
9a09e2d756 add Cargo.toml 2014-06-25 07:35:03 +02:00
Arcterus
b08415afea Merge pull request #306 from redcape/regex-hashsum-check2
Fix for hashsum: fix file checking #305
2014-06-24 20:44:23 -07:00
Gil Cottle
5986b77e1c Fix typo and code formatting 2014-06-24 23:26:24 -04:00
Gil Cottle
16b569ee18 fix comment 2014-06-24 22:15:56 -04:00
Gil Cottle
978ee8cc3a Fix for hashsum: fix file checking #305
* Changed line verifications to use regular expressions.
* Added binary marker to output and start using the marker from
    the check file line as input to calc_sum
* Convert characters to lowercase before comparison in check

Gil Cottle <gcottle@redtown.org>
2014-06-24 21:42:58 -04:00
Oly Mi
f5dfa0a9b9 Merge pull request #301 from Arcterus/seq-broken-pipe
seq: fix broken pipe on Busybox test
2014-06-23 21:48:08 +04:00
Arcterus
3f06adfcbc seq: fix broken pipe on Busybox test 2014-06-23 09:53:28 -07:00
Oly Mi
d160c37d23 Merge pull request #300 from Arcterus/seq-busybox
seq: pass all Busybox tests
2014-06-23 12:44:15 +04:00
Arcterus
ac4b3b7103 seq: pass all Busybox tests 2014-06-23 01:34:39 -07:00
Oly Mi
9597dca983 Merge pull request #299 from Arcterus/hashsum-fix
uutils: fix hashsum
2014-06-23 12:02:49 +04:00
Arcterus
ca8077c2bc uutils: fix hashsum 2014-06-23 01:00:15 -07:00
Arcterus
54d0436069 Merge pull request #298 from ebfe/update-rust-crypto
Update rust-crypto
2014-06-23 00:56:15 -07:00