completions/apt: Read from the dpkg cache directly

I have no idea why `apt-cache --no-generate show` is so slow, since it basically
dumps the contents of the cache file located at `/var/lib/dpkg/status`. By
reading that file directly we technically bypass any wait on the cache lock
file, so this may produce incorrect results if the cache is being regenerated
at that moment. That's a small price to pay, and the damage is likely confined
to an incomplete list of completions.

With this change, we no longer need to truncate results to the first n matches,
and we no longer print only packages beginning with the commandline argument.
This lets fish's partial completion logic offer less-perfect suggestions when
no better options are available.
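
As a rough illustration of the substring matching described above, here is a
minimal sketch run against a made-up file in the `/var/lib/dpkg/status` format.
The `search_packages` helper and the sample records are hypothetical and not
part of this commit. The sketch also passes the search term with `awk -v`,
whereas the committed completion splices it into the awk program text.

```sh
#!/bin/sh
# Hypothetical sample in the dpkg status format; not real package data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Package: acl
Status: install ok installed
Description: access control list - utilities
 Continuation lines start with a space and are ignored.

Package: libacl1
Status: install ok installed
Description: access control list - shared library

Package: zsh
Status: install ok installed
Description: shell with lots of features
EOF

# Print "name<TAB>description" for every package whose name contains the term.
# $1 is the search term, $2 is the path to a status-format file.
search_packages() {
    awk -v term="$1" '
        BEGIN { FS = ": " }
        /^Package:/ { pkg = $2 }
        /^Description(-[a-zA-Z]+)?:/ {
            if (pkg != "" && index(pkg, term) > 0) {
                print pkg "\t" $2
            }
            pkg = ""  # skip any further translated descriptions
        }
    ' "$2"
}

# Matches "acl" and "libacl1" but not "zsh".
search_packages ac "$sample"
rm -f "$sample"
```

Because `index()` matches anywhere in the package name, `libacl1` is offered
for `ac` even though it does not begin with it, which is what leaves the final
ranking to fish.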

Even though we are generating more usable completions, we still trounce the old
performance by leaps and bounds:

```
Benchmark #1: fish -c "complete -C\"apt install ac\""
  Time (mean ± σ):      2.165 s ±  0.033 s    [User: 267.0 ms, System: 1932.2 ms]
  Range (min … max):    2.136 s …  2.256 s    10 runs

Benchmark #2: build/fish -c "complete -C\"apt install ac\""
  Time (mean ± σ):     111.1 ms ±   1.8 ms    [User: 38.9 ms, System: 72.9 ms]
  Range (min … max):   108.2 ms … 114.9 ms    26 runs

Summary
  'build/fish -c "complete -C\"apt install ac\""' ran
   19.49 ± 0.44 times faster than 'fish -c "complete -C\"apt install ac\""'
```
Mahmoud Al-Qudsi 2023-02-05 16:17:08 -06:00
parent 6f3711902b
commit 96deaae7d8
2 changed files with 50 additions and 28 deletions


@@ -33,9 +33,10 @@ end
complete -c apt -f
complete -k -c apt -n "__fish_seen_subcommand_from $pkg_subcmds" -a '(__fish_print_apt_packages | string match -re -- "(?:\\b|_)"(commandline -ct | string escape --style=regex) | head -n 250 | sort)'
complete -c apt -n "__fish_seen_subcommand_from $installed_pkg_subcmds" -a '(__fish_print_apt_packages --installed | string match -re -- "(?:\\b|_)"(commandline -ct | string escape --style=regex) | head -n 250)'
complete -k -c apt -n "__fish_seen_subcommand_from $handle_file_pkg_subcmds" -a '(__fish_complete_suffix .deb)'
# We use -k to keep PWD directories (from the .deb completion) after packages, so we need to sort the packages
complete -k -c apt -n "__fish_seen_subcommand_from $handle_file_pkg_subcmds" -kxa '(__fish_complete_suffix .deb)'
complete -k -c apt -n "__fish_seen_subcommand_from $pkg_subcmds" -kxa '(__fish_print_apt_packages | sort)'
complete -c apt -n "__fish_seen_subcommand_from $installed_pkg_subcmds" -kxa '(__fish_print_apt_packages --installed | sort)'
complete -c apt -n "__fish_seen_subcommand_from install" -l no-install-recommends
# This advanced flag is the safest way to upgrade packages that otherwise would have been kept back


@@ -7,32 +7,53 @@ function __fish_print_apt_packages
return
end
type -q -f apt-cache || return 1
set -l search_term (commandline -ct | string replace -ar '[\'"\\\\]' '' | string lower)
if ! test -f /var/lib/dpkg/status
return 1
end
# Do not use `apt-cache` as it is sometimes inexplicably slow (by multiple orders of magnitude).
if not set -q _flag_installed
# Do not generate the cache as apparently sometimes this is slow.
# http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=547550
# (It is safe to use `sed -r` here as we are guaranteed to be on a GNU platform
# if apt-cache was found.)
# Uses the UTF-8/ASCII record separator (0x1A) character.
#
# Note: This can include "Description:" fields which we need to include,
# "Description-en_GB" (or another locale code) fields which we need to include
# as well as "Description-md5" fields which we absolutely do *not* want to include
# The regex doesn't allow numbers, so unless someone makes a hash algorithm without a number
# in the name, we're safe. (yes, this should absolutely have a better format).
#
# aptitude has options that control the output formatting, but is orders of magnitude slower
#
# sed could probably do all of the heavy lifting here, but would be even less readable
#
# The `head -n 500` causes us to stop once we have 500 lines. We do it after the `sed` because
# Debian package descriptions can be extremely long and are hard-wrapped: texlive-latex-extra
# has about 2700 lines on Debian 11.
apt-cache --no-generate show '.*'(commandline -ct)'.*' 2>/dev/null | sed -r '/^(Package|Description-?[a-zA-Z_]*):/!d;s/Package: (.*)/\1\t/g;s/Description-?[^:]*: (.*)/\1\x1a\n/g' | head -n 500 | string join "" | string replace --all --regex \x1a+ \n | uniq
return 0
awk -e '
BEGIN {
FS=": "
}
/^Package/ {
pkg=$2
}
/^Description(-[a-zA-Z]+)?:/ {
desc=$2
if (index(pkg, "'$search_term'") > 0) {
print pkg "\t" desc
}
pkg="" # Prevent multiple description translations from being printed
}' < /var/lib/dpkg/status
else
set -l packages (dpkg --get-selections | string replace -fr '(\S+)\s+install' "\$1" | string match -e (commandline -ct))
apt-cache --no-generate show $packages 2>/dev/null | sed -r '/^(Package|Description-?[a-zA-Z_]*):/!d;s/Package: (.*)/\1\t/g;s/Description-?[^:]*: (.*)/\1\x1a\n/g' | head -n 500 | string join "" | string replace --all --regex \x1a+ \n | uniq
return 0
awk -e '
BEGIN {
FS=": "
}
/^Package/ {
pkg=$2
}
/^Status/ {
installed=0
if ($2 ~ /(^|\s)installed/) {
installed=1
}
}
/^Description(-[a-zA-Z]+)?:/ {
desc=$2
if (installed == 1 && index(pkg, "'$search_term'") > 0) {
print pkg "\t" desc
installed=0 # Prevent multiple description translations from being printed
}
}' < /var/lib/dpkg/status
end
end