mirror of
https://github.com/nushell/nushell
synced 2024-12-28 14:03:09 +00:00
2e5a857983
# Description This PR tries to make `query web` more resilient and easier to debug with the `--inspect` parameter when trying to scrape tables. Previously it would just fail, now at least it tries to give you a hint. This is some example output now of when something went wrong. ``` ❯ http get https://en.wikipedia.org/wiki/List_of_cities_in_India_by_population | query web --as-table [Rank City 'Population(2011)[3]' 'Population(2001)[3][a]' 'State or union territory'] --inspect Passed in Column Headers = ["Rank", "City", "Population(2011)[3]", "Population(2001)[3][a]", "State or union territory"] First 2048 HTML chars = <!DOCTYPE html> <html class="client-nojs vector-feature-language-in-header-enabled vector-feature-language-in-main-page-header-disabled vector-feature-sticky-header-disabled vector-feature-page-tools-pinned-disabled vector-feature-toc-pinned-clientpref-1 vector-feature-main-menu-pinned-disabled vector-feature-limited-width-clientpref-1 vector-feature-limited-width-content-enabled vector-feature-custom-font-size-clientpref-0 vector-feature-client-preferences-disabled vector-feature-client-prefs-pinned-disabled vector-toc-available" lang="en" dir="ltr"> <head> <meta charset="UTF-8"> <title>List of cities in India by population - Wikipedia</title> <script>(function(){var className="client-js vector-feature-language-in-header-enabled vector-feature-language-in-main-page-header-disabled vector-feature-sticky-header-disabled vector-feature-page-tools-pinned-disabled vector-feature-toc-pinned-clientpref-1 vector-feature-main-menu-pinned-disabled vector-feature-limited-width-clientpref-1 vector-feature-limited-width-content-enabled vector-feature-custom-font-size-clientpref-0 vector-feature-client-preferences-disabled vector-feature-client-prefs-pinned-disabled vector-toc-available";var cookie=document.cookie.match(/(?:^|; )enwikimwclientpreferences=([^;]+)/);if(cookie){cookie[1].split('%2C').forEach(function(pref){className=className.replace(new RegExp('(^| )'+pref.replace(/-clientpref-\w+$|[^\w-]+/g,'')+'-clientpref-\\w+( |$)'),'$1'+pref+'$2');});}document.documentElement.className=className;}());RLCONF={"wgBreakFrames":false,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["", "January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"9ecdad8f-2dbd-4245-b54d-9c57aea5ca45","wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"List_of_cities_in_India_by_population","wgTitle":"List of cities in India by population","wgCurRevisionId":1192093210,"wgRev Potential HTML Headers = ["City", "Population(2011)[3]", "Population(2001)[3][a]", "State or unionterritory", "Ref"] Potential HTML Headers = ["City", "Population(2011)[5]", "Population(2001)", "State or unionterritory"] Potential HTML Headers = [".mw-parser-output .navbar{display:inline;font-size:88%;font-weight:normal}.mw-parser-output .navbar-collapse{float:left;text-align:left}.mw-parser-output .navbar-boxtext{word-spacing:0}.mw-parser-output .navbar ul{display:inline-block;white-space:nowrap;line-height:inherit}.mw-parser-output .navbar-brackets::before{margin-right:-0.125em;content:\"[ \"}.mw-parser-output .navbar-brackets::after{margin-left:-0.125em;content:\" ]\"}.mw-parser-output .navbar li{word-spacing:-0.125em}.mw-parser-output .navbar a>span,.mw-parser-output .navbar a>abbr{text-decoration:inherit}.mw-parser-output .navbar-mini abbr{font-variant:small-caps;border-bottom:none;text-decoration:none;cursor:inherit}.mw-parser-output .navbar-ct-full{font-size:114%;margin:0 7em}.mw-parser-output .navbar-ct-mini{font-size:114%;margin:0 4em}vtePopulation of cities in India"] Potential HTML Headers = ["vteGeography of India"] ╭──────────────────────────┬─────────────────────────────────────────────────────╮ │ Rank │ error: no data found (column name may be incorrect) │ │ City │ error: no data found (column name may be incorrect) │ │ Population(2011)[3] │ error: no data found (column name may be incorrect) │ │ Population(2001)[3][a] │ error: no data found (column name may be incorrect) │ │ State or union territory │ error: no data found (column name may be incorrect) │ ╰──────────────────────────┴─────────────────────────────────────────────────────╯ ``` The key here is to look at the `Passed in Column Headers` and compare them to the `Potential HTML Headers` and couple that with the error table at the bottom should give you a hint that, in this situation, wikipedia has changed the column names, yet again. So we need to update our query web statement's tables to get closer to what we want. ``` ❯ http get https://en.wikipedia.org/wiki/List_of_cities_in_India_by_population | query web --as-table [City 'Population(2011)[3]' 'Population(2001)[3][a]' 'State or unionterritory' 'Ref'] ╭─#──┬───────City───────┬─Population(2011)[3]─┬─Population(2001)[3][a]─┬─State or unionterritory─┬──Ref───╮ │ 0 │ Mumbai │ 12,442,373 │ 11,978,450 │ Maharashtra │ [3] │ │ 1 │ Delhi │ 11,034,555 │ 9,879,172 │ Delhi │ [3] │ │ 2 │ Bangalore │ 8,443,675 │ 5,682,293 │ Karnataka │ [3] │ │ 3 │ Hyderabad │ 6,993,262 │ 5,496,960 │ Telangana │ [3] │ │ 4 │ Ahmedabad │ 5,577,940 │ 4,470,006 │ Gujarat │ [3] │ │ 5 │ Chennai │ 4,646,732 │ 4,343,645 │ Tamil Nadu │ [3] │ │ 6 │ Kolkata │ 4,496,694 │ 4,580,546 │ West Bengal │ [3] │ │ 7 │ Surat │ 4,467,797 │ 2,788,126 │ Gujarat │ [3] │ │ 8 │ Pune │ 3,124,458 │ 2,538,473 │ Maharashtra │ [3] │ │ 9 │ Jaipur │ 3,046,163 │ 2,322,575 │ Rajasthan │ [3] │ │ 10 │ Lucknow │ 2,817,105 │ 2,185,927 │ Uttar Pradesh │ [3] │ │ 11 │ Kanpur │ 2,765,348 │ 2,551,337 │ Uttar Pradesh │ [3] │ │ 12 │ Nagpur │ 2,405,665 │ 2,052,066 │ Maharashtra │ [3] │ ``` # User-Facing Changes <!-- List of all changes that impact the user experience here. This helps us keep track of breaking changes. --> # Tests + Formatting <!-- Don't forget to add tests that cover your changes. Make sure you've run and fixed any issues with these commands: - `cargo fmt --all -- --check` to check standard code formatting (`cargo fmt --all` applies these changes) - `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to check that you're using the standard code style - `cargo test --workspace` to check that all tests pass (on Windows make sure to [enable developer mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging)) - `cargo run -- -c "use std testing; testing run-tests --path crates/nu-std"` to run the tests for the standard library > **Note** > from `nushell` you can also use the `toolkit` as follows > ```bash > use toolkit.nu # or use an `env_change` hook to activate it automatically > toolkit check pr > ``` --> # After Submitting <!-- If your PR had any user-facing changes, update [the documentation](https://github.com/nushell/nushell.github.io) after the PR is merged, if necessary. This will help us keep the docs up to date. --> |
||
---|---|---|
.. | ||
nu-cli | ||
nu-cmd-base | ||
nu-cmd-dataframe | ||
nu-cmd-extra | ||
nu-cmd-lang | ||
nu-color-config | ||
nu-command | ||
nu-engine | ||
nu-explore | ||
nu-glob | ||
nu-json | ||
nu-lsp | ||
nu-parser | ||
nu-path | ||
nu-plugin | ||
nu-pretty-hex | ||
nu-protocol | ||
nu-std | ||
nu-system | ||
nu-table | ||
nu-term-grid | ||
nu-test-support | ||
nu-utils | ||
nu_plugin_custom_values | ||
nu_plugin_example | ||
nu_plugin_formats | ||
nu_plugin_gstat | ||
nu_plugin_inc | ||
nu_plugin_python | ||
nu_plugin_query | ||
README.md |
Nushell core libraries and plugins
These sub-crates form both the foundation for Nu and a set of plugins which extend Nu with additional functionality.
Foundational libraries are split into two kinds of crates:
- Core crates - those crates that work together to build the Nushell language engine
- Support crates - a set of crates that support the engine with additional features like JSON support, ANSI support, and more.
Plugins are likewise also split into two types:
- Core plugins - plugins that provide part of the default experience of Nu, including access to the system properties, processes, and web-connectivity features.
- Extra plugins - these plugins run a wide range of different capabilities like working with different file types, charting, viewing binary data, and more.