To support unaligned output buffers (i.e. 'in' in the terminology of
the SPI framework), this change splits each 16bit FIFO element after
reading and writes them to memory in two 8bit transactions. With this
change, we can now always use the optimised mode for receive-only
transactions, independent of the alignment of the target buffer.
Given that we'll run with caches on, the impact should be negligible:
as expected, this has no adverse impact on throughput if running with
a 960MHz LPLL configuration.
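
A minimal sketch of the split, not the driver code itself: the helper
name and the array standing in for the RX FIFO register are made up for
illustration, and the byte order (first received byte in the low half
of each element) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store 'frames' 16-bit FIFO elements into a byte
 * buffer of arbitrary alignment using two 8-bit stores per element.
 * Assumption: the first received byte sits in the low half of an element.
 */
static void store_frames_unaligned(uint8_t *in, const uint16_t *fifo,
				   size_t frames)
{
	for (size_t i = 0; i < frames; i++) {
		uint16_t val = fifo[i];	/* stands in for one FIFO read */

		*in++ = val & 0xff;		/* first 8-bit store */
		*in++ = (val >> 8) & 0xff;	/* second 8-bit store */
	}
}
```

Because only byte stores are issued, the target pointer may be odd
without triggering unaligned 16-bit accesses.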
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
The SPI controller's documentation (I only had access to the RK3399,
RK3368 and PX30 TRMs) specifies that, when operating in master-mode,
the controller will stop the SCLK to avoid RXFIFO overruns and TXFIFO
underruns. Looks like my worries that we'd need to support DMA-330
(aka PL330) to make any further progress were unfounded.
This adds a driver-data structure to capture hardware-specific
settings of individual controller instances (after all, we don't know
if all versions are well-behaved) and adds a 'master_manages_fifo'
flag to it. The first use of said flag is in the optimised
receive-only transfer-handler, which can now request 64Kframe
(i.e. 128KByte) bursts of data on each reprogramming of CTRLR1
(i.e. every time through the loop).
This improves throughput to 46.85MBit/s (a 94.65% bus-utilisation).
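
The burst-size selection could look roughly like the sketch below. The
struct and function names are hypothetical (only the
'master_manages_fifo' flag name comes from this change), and the
32-frame fallback is assumed to be the conservative chunk size used
when the flag is not set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the per-controller driver data. */
struct rkspi_data_sketch {
	bool master_manages_fifo;
};

#define FIFO_CHUNK_FRAMES	32u		/* assumed conservative chunk */
#define MAX_BURST_FRAMES	0x10000u	/* 64Kframes per CTRLR1 write */

/* Pick how many 16-bit frames to request for the next burst. */
static size_t next_burst_frames(const struct rkspi_data_sketch *data,
				size_t frames_left)
{
	size_t max = data->master_manages_fifo ? MAX_BURST_FRAMES
					       : FIFO_CHUNK_FRAMES;

	return frames_left < max ? frames_left : max;
}
```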
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
For the RK3399-Q7 we recommend storing SPL and u-boot.itb in the
on-module 32MBit (and sometimes even larger, if requested as part of a
configure-to-order configuration) SPI-NOR flash that is clocked for a
bitrate of 49.5MBit/s and connected in a single-IO configuration (the
RK3399 only supports single-IO for SPI).
Unfortunately, the existing SPI driver is excruciatingly slow at
reading out large chunks of data (in fact it is just as slow for small
chunks of data, but the overheads of the driver-framework make it less
noticeable): before this change, the throughput on a 4MB read from
SPI-NOR is 8.47MBit/s, which equates to a 17.11% bus-utilisation.
To improve on this, this commit adds an optimised receive-only
transfer (i.e.: out == NULL) handler that hooks into the main transfer
function and processes data in 16bit frames (utilising the full width
of each FIFO element). As of now, the receive-only handler requires
the in-buffer to be 16bit aligned. Any lingering data (i.e. either if
the in-buffer was not 16-bit aligned or if an odd number of bytes are
to be received) will be handled by the original 8bit reader/writer.
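
As a rough illustration of that split (the helper name is hypothetical
and this is not the handler itself), the number of bytes eligible for
the 16bit path could be computed as:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes the 16-bit receive-only path can
 * take. The remainder (len minus the result) is left to the 8-bit path.
 */
static size_t rx16_bytes(const void *in, size_t len)
{
	if ((uintptr_t)in & 1)
		return 0;		/* unaligned buffer: 8-bit path only */

	return len & ~(size_t)1;	/* drop a trailing odd byte */
}
```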
Given that the SPI controller's documentation does not guarantee any
interlocking between the RXFIFO and the master SCLK, the transfer loop
will be restarted for each chunk of 32 frames (i.e. 64 bytes).
With this new receive-only transfer handler, the throughput for a 4MB
read increases to 36.28MBit/s (i.e. 73.29% bus-utilisation): this is a
4x improvement over the baseline.
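
For reference, the bus-utilisation figures quoted here are simply the
measured throughput divided by the 49.5MBit/s bitrate of the SPI-NOR.
A small sketch of the arithmetic (the function name is made up):

```c
/* Bus utilisation in percent: achieved throughput over the SPI bitrate. */
static double bus_utilisation_pct(double throughput_mbit, double bitrate_mbit)
{
	return 100.0 * throughput_mbit / bitrate_mbit;
}
```

With a 49.5MBit/s bitrate, 8.47MBit/s works out to roughly 17.11% and
36.28MBit/s to roughly 73.29%.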
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reported-by: Klaus Goger <klaus.goger@theobroma-systems.com>
Series-Cc: Klaus Goger <klaus.goger@theobroma-systems.com>
Series-Cc: Christoph Muellner <christoph.muellner@theobroma-systems.com>
The logic in the main transmit loop took a bit of reading the TRM to
fully understand (due to silent assumptions based on internal logic):
the "wait until idle" at the end of each iteration through the loop is
required for the transmit-path as each clearing of the ENA register
(to update run-length in the CTRLR1 register) will implicitly flush
the FIFOs... transmission can therefore not overlap loop iterations.
This change adds a comment to clarify the reason/need for waiting
until the controller becomes idle and wraps the entire check into an
'if (out)' to make it clear that this is required for transfers with a
transmit-component only (for transfers having a receive-component,
completion of the transmit-side is trivially ensured by having
received the correct number of bytes).
The change does not increase execution time measurably in any of my
tests.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
While rkspi_enable_chip is called with true/false everywhere else in
the file, one call site uses '0' to denote 'false'.
This changes this one parameter to 'false' for consistency.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
The maximum transfer length (in a single transaction) for the Rockchip
SPI controller is 64Kframes (i.e. 0x10000 frames) of 8bit or 16bit
frames and is encoded as (num_frames - 1) in CTRLR1. The existing
code subtracted the "minus 1" twice for a maximum transfer length of
0xffff (64K - 1) frames.
While this is not strictly an error (the existing code is correct, but
leads to a bit of head-scratching), fix this off-by-one situation.
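
A sketch of the intended encoding, using a hypothetical helper name
(the driver may structure this differently): the full 64Kframes are
allowed per transaction and the "minus 1" is applied exactly once.

```c
#include <stdint.h>

#define RKSPI_MAX_FRAMES	0x10000u	/* 64Kframes per transaction */

/*
 * Hypothetical helper: value to program into CTRLR1 for a transaction
 * of 'frames' frames. The register holds (num_frames - 1), so callers
 * must pass at least 1; anything above the maximum is capped and the
 * caller is expected to loop for the remainder.
 */
static uint32_t ctrlr1_for_frames(uint32_t frames)
{
	if (frames > RKSPI_MAX_FRAMES)
		frames = RKSPI_MAX_FRAMES;

	return frames - 1;
}
```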
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Even though the priv-structure and the claim-bus function contain
logic for 16bit frames and for unidirectional transfer modes, neither
of these is used anywhere in the driver.
This removes the unused (as in "has no effect") logic and fields.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
In analysing delays introduced for large SPI reads, the absence of any
indication when a delay was inserted (to ensure the CS toggling is
observed by devices) became apparent.
Add a debug-only message to record the insertion and duration of any
delay (note that the debug message itself will cause a delay on top of
the delay duration).
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
When U-Boot started using SPDX tags we were among the early adopters and
there weren't a lot of other examples to borrow from. So we picked the
area of the file that usually had a full license text and replaced it
with an appropriate SPDX-License-Identifier: entry. Since then, the
Linux Kernel has adopted SPDX tags and they place it as the very first
line in a file (except where shebangs are used, then it's second line)
and with slightly different comment styles than us.
In part due to community overlap, in part due to better tag visibility
and in part for other minor reasons, switch over to that style.
This commit changes all instances where we have a single declared
license in the tag as both the before and after are identical in tag
contents. There are also a few places where I found we did not have a tag
and have introduced one.
Signed-off-by: Tom Rini <trini@konsulko.com>
We have a large number of places where while we historically referenced
gd in the code we no longer do, as well as cases where the code added
that line "just in case" during development and never dropped it.
Signed-off-by: Tom Rini <trini@konsulko.com>
fix typo
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reviewed-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Update the Rockchip SPI driver to support a live device tree.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Jagan Teki <jagan@openedev.com>
Acked-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
For the RK3368, we can reuse the SPI driver (although we'll have to
eventually investigate whether it can be merged with the
designware_spi.c driver) also used for the RK3288 and RK3399.
This adds the necessary compatible string to support the RK3368.
Note that the assumption that GPLL will be clocked at 594MHz is not
true for the RK3368, but this will not lead to incorrect functioning
(just to a lower-than-expected SPI operating frequency): this has been
documented in the driver, so it doesn't cause any headaches when
someone next needs to touch the clock code of this driver.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
With the new dev_read functions available, we can convert the rockchip
architecture-specific drivers and common drivers used by these devices
over to the dev_read family of calls.
This change covers the rk_spi.c (SPI driver) used in Rockchip devices.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
These support the flat device tree. We want to use the dev_read_..()
prefix for functions that support both flat tree and live tree. So rename
the existing functions to avoid confusion.
In the end we will have:
1. dev_read_addr...() - works on devices, supports flat/live tree
2. devfdt_get_addr...() - current functions, flat tree only
3. of_get_address() etc. - new functions, live tree only
All drivers will be written to use 1. That function will in turn call
either 2 or 3 depending on whether the flat or live tree is in use.
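
The layering can be pictured with the stand-alone sketch below. Every
name in it is a hypothetical stand-in (it does not use the real
function signatures); it only shows how accessor 1 dispatches to
accessor 2 or 3.

```c
#include <stdbool.h>

typedef unsigned long addr_sketch_t;

/* Hypothetical device: records which kind of tree backs it. */
struct dev_sketch {
	bool live_tree;
	addr_sketch_t flat_addr;
	addr_sketch_t live_addr;
};

/* Stand-in for 2: flat tree only. */
static addr_sketch_t sketch_flat_get_addr(const struct dev_sketch *dev)
{
	return dev->flat_addr;
}

/* Stand-in for 3: live tree only. */
static addr_sketch_t sketch_live_get_addr(const struct dev_sketch *dev)
{
	return dev->live_addr;
}

/* Stand-in for 1: what drivers call; picks 2 or 3 at run time. */
static addr_sketch_t sketch_read_addr(const struct dev_sketch *dev)
{
	return dev->live_tree ? sketch_live_get_addr(dev)
			      : sketch_flat_get_addr(dev);
}
```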
Note this involves changing some dead code - the imx_lpi2c.c file.
Signed-off-by: Simon Glass <sjg@chromium.org>
The existing Rockchip SPI (rk_spi.c) driver also matches the hardware
block found in the RK3399. This has been confirmed both with SPI NOR
flashes and general SPI transfers on the RK3399-Q7 for SPI1 and SPI5.
This change adds the 'rockchip,rk3399-spi' string to its compatible
list to allow reuse of the existing driver.
X-AffectedPlatforms: RK3399-Q7
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Tested-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Acked-by: Simon Glass <sjg@chromium.org>
The baudrate in rkspi was calculated by using an integer division
(which implicitly discarded any fractional result), then rounding to
an even number and finally clamping to 0xfffe using a bitwise AND
operator. This introduced two issues:
1) for very small baudrates (overflowing the 0xfffe range), the
bitwise-AND generates rather random-looking (wildly varying)
actual output bitrates
2) for higher baudrates, the calculation tends to 'err towards a
higher baudrate' with the actual error increasing as the dividers
become very small. E.g., with a 99MHz input clock, a request
for a 20MBit baudrate (99/20 = 4.95), 24.75 MBit would be used
(which amounts to a 23.75% error)... for a 34 MBit request this
would be an actual output of 49.5 Mbit (i.e. a 45% error).
This change rewrites the divider selection (i.e. baudrate calculation)
by making sure that
a) for the normal case: the largest representable baudrate below the
requested rate will be chosen;
b) for the denormal case (i.e. when the divider can no longer be
represented), the lowest representable baudrate is chosen.
Even though the denormal case (b) may be of little concern in real
world applications (even with a 198MHz input clock, this will only
happen at below approx. 3kHz/3kBit), our board-verification team kept
complaining.
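
A minimal sketch of rules (a) and (b), not necessarily identical to the
code in the driver: the helper name is made up, and the assumption is
that the divider must be an even value between 2 and 0xfffe.

```c
#include <stdint.h>

#define SCKDV_MIN	2u	/* assumed smallest valid (even) divider */
#define SCKDV_MAX	0xfffeu	/* largest representable divider */

/* Hypothetical helper: pick the divider for a requested bitrate. */
static uint32_t select_divider(uint32_t input_hz, uint32_t speed_hz)
{
	uint64_t div;

	if (speed_hz == 0)
		return SCKDV_MAX;	/* treat as "as slow as possible" */

	/* (a) round up so the actual bitrate never exceeds the request */
	div = ((uint64_t)input_hz + speed_hz - 1) / speed_hz;
	div = (div + 1) & ~(uint64_t)1;	/* next even value */

	if (div < SCKDV_MIN)
		div = SCKDV_MIN;
	/* (b) denormal case: fall back to the lowest representable rate */
	if (div > SCKDV_MAX)
		div = SCKDV_MAX;

	return (uint32_t)div;
}
```

With a 99MHz input, a 20MBit request now yields a divider of 6
(16.5MBit) and a 34MBit request a divider of 4 (24.75MBit), i.e. both
stay below the requested rate.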
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Tested-by: Klaus Goger <klaus.goger@theobroma-systems.com>
The original clock/bitrate selection code for the rk_spi driver was a
bit limited, as it always selected a 99MHz input clock rate (which
would allow for a maximum bitrate of 49.5MBit/s), but returned -EINVAL
if a bitrate higher than 48MHz was requested.
To give us better control over the bitrate (i.e. add more operating
points, especially at "higher" bitrate---such as above 9MBit/s), we
try to choose 4x the maximum frequency (clamped to 50MBit) from the
DTS instead of 99MHz... for most use-cases this will yield a frequency
of 198MHz, but is flexible to go beyond this in future configurations.
This also rewrites the check to allow frequencies of up to half the
SPI module rate as bitrates and then clamps to whatever the DTS allows
as a maximum (board-specific) frequency and does away with the -EINVAL
when trying to select a bitrate (for cases that exceeded the hard
limit) and instead consistently clamps to the lower of the hard limit,
the soft limit for the SPI bus (from the DTS) or the soft limit for
the SPI slave device.
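
The clamping described above amounts to taking the minimum of the
request and the three limits. A sketch with hypothetical names (the
parameter split is an assumption made for illustration):

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: clamp a requested bitrate instead of failing.
 * Hard limit: half the SPI module input rate.
 * Soft limits: the bus maximum (from the DTS) and the slave maximum.
 */
static uint32_t effective_bitrate(uint32_t module_hz, uint32_t bus_max_hz,
				  uint32_t slave_max_hz, uint32_t requested_hz)
{
	uint32_t rate = min_u32(requested_hz, module_hz / 2);

	rate = min_u32(rate, bus_max_hz);
	rate = min_u32(rate, slave_max_hz);

	return rate;
}
```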
This replaces
"rockchip: spi: rk_spi: select 198MHz input to the SPI module for the RK3399"
"rockchip: spi: rk_spi: improve clocking code for the RK3399"
from earlier versions of this series.
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
At present devices use a simple integer offset to record the device tree
node associated with the device. In preparation for supporting a live
device tree, which uses a node pointer instead, refactor existing code to
access this field through an inline function.
Signed-off-by: Simon Glass <sjg@chromium.org>
Now, arch/${ARCH}/include/asm/errno.h and include/linux/errno.h have
the same content. (both just wrap <asm-generic/errno.h>)
Replace all include directives for <asm/errno.h> with <linux/errno.h>.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
[trini: Fixup include/clk.]
Signed-off-by: Tom Rini <trini@konsulko.com>
The following changes are made to the clock API:
* The concepts of "clocks" and "peripheral clocks" are unified; each clock
provider now implements a single set of clocks. This provides a simpler
conceptual interface to clients, and better aligns with device tree
clock bindings.
* Clocks are now identified with a single "struct clk", rather than
requiring clients to store the clock provider device and clock identity
values separately. For simple clock consumers, this isolates clients
from internal details of the clock API.
* clk.h is split so it only contains the client/consumer API, whereas
clk-uclass.h contains the provider API. This aligns with the recently
added reset and mailbox APIs.
* clk_ops .of_xlate(), .request(), and .free() are added so providers
can customize these operations if needed. This also aligns with the
recently added reset and mailbox APIs.
* clk_disable() is added.
* All users of the current clock APIs are updated.
* Sandbox clock tests are updated to exercise clock lookup via DT, and
clock enable/disable.
* rkclk_get_clk() is removed and replaced with standard APIs.
Buildman shows no clock-related errors for any board for which buildman
can download a toolchain.
test/py passes for sandbox (which invokes the dm clk test amongst
others).
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Simon Glass <sjg@chromium.org>
The correct pinctrl is handled automatically so we don't need to do it in
the driver. The exception is when we want to use a different chip select
(other than 0). But this isn't used at present.
Signed-off-by: Simon Glass <sjg@chromium.org>
At present there is an incorrect call to rkspi_enable_chip(). It should
be disabling the chip, not enabling it. Correct this and ensure that the
chip is disabled when releasing the bus.
Signed-off-by: Simon Glass <sjg@chromium.org>
Some devices need delays before and after activation. Implement these
features in the SPI driver so that we will be able to enable the Chrome
OS EC.
Signed-off-by: Simon Glass <sjg@chromium.org>
Two of the init values are created locally so cannot be out of range.
The masking is unnecessary and in one case is incorrect. Fix it.
Signed-off-by: Simon Glass <sjg@chromium.org>
Rather than changing the clock to the same value on every transaction,
remember the last value and don't adjust the clock unless it is necessary.
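
The idea, sketched with made-up names (the counter merely stands in for
the actual hardware clock update):

```c
/* Hypothetical per-bus state: last programmed speed plus a counter. */
struct clk_cache_sketch {
	unsigned int last_speed_hz;
	unsigned int reprogram_count;
};

/* Only touch the clock when the requested speed actually changes. */
static void set_speed_cached(struct clk_cache_sketch *c, unsigned int speed_hz)
{
	if (speed_hz == c->last_speed_hz)
		return;

	c->reprogram_count++;	/* stands in for reprogramming the clock */
	c->last_speed_hz = speed_hz;
}
```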
Signed-off-by: Simon Glass <sjg@chromium.org>
If full pinctrl is enabled we don't need to manually set the pinctrl in the
driver. It will happen automatically. Adjust the code to suit - we will
still use manual mode in SPL.
Signed-off-by: Simon Glass <sjg@chromium.org>
Add a SPI driver for the Rockchip RK3288, using driver model. It should work
for other Rockchip SoCs also.
Signed-off-by: Simon Glass <sjg@chromium.org>