As reported in STAR 9001165532, an SLC control reg read (for checking
busy state) right after SLC invalidate command may incorrectly return
NOT busy causing software to NOT spin-wait while operation is underway.
(and for some reason this only happens if L1 cache is also disabled - as
required by IOC programming model)
Suggested workaround is to do an additional Control Reg read, which
ensures the 2nd read gets the right status.
Same fix made in Linux kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c70c473396cbdec1168a6eff60e13029c0916854
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Even though we expect only master core to execute U-Boot code
let's make sure even if for some reason slave cores attempt to
execute U-Boot in parallel with master they get halted very early.
If platform wants it may kick-start slave cores before passing control
to say Linux kernel or any other application that want to see all cores
of SMP SoC up and running.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
This commit replaces legacy timer code with usage of arc timer
driver.
It removes arch/arc/lib/time.c file and selects CONFIG_CLK,
CONFIG_TIMER and CONFIG_ARC_TIMER options for all ARC boards by default.
Therefore we remove CONFIG_CLK option from less common axs101 and
axs103 defconfigs.
Also it removes legacy CONFIG_SYS_TIMER_RATE config symbol from
axs10x.h, tb100.h and nsim.h configs files as it is no longer required.
Signed-off-by: Vlad Zakharov <vzakhar@synopsys.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
Commit e2f88dfd2d ("libfdt: Introduce new ARCH_FIXUP_FDT option")
allows us to skip memory setup of DTB, but a problem for ARM is that
spin_table_update_dt() and psci_update_dt() are skipped as well if
CONFIG_ARCH_FIXUP_FDT is disabled.
This commit allows us to skip only fdt_fixup_memory_banks() instead
of the whole of arch_fixup_fdt(). It will be useful when we want to
use a memory node from a kernel DTB as is, but need some fixups for
Spin-Table/PSCI.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Acked-by: Simon Glass <sjg@chromium.org>
Fixed build error for x86:
Signed-off-by: Simon Glass <sjg@chromium.org>
Starting from arc-2016.03 GNU tools linker properly works with
symbols defined in linker script and so external declarations
are no longer required, dump them.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Initially IVT for ARCv2 was simply copypasted from ARCompact
with some selected fixes so basic stuff works.
Now we update it with more ARCv2 specific vectors like
* Software Interrupt
* Division by zero
* Data cache consistency error
* Misaligned access
Also normal interrupts are now implemented properly and extened to
all possible 240 items.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
This might be useful to make sure relocation fixups really
happen. And since this info gets printed only in DEBUG
build it doesn't really hurt normal execution.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
According to ARC HS databook it is required to flush and disable
caches prior programming IOC registers. Otherwise ongoing coherent
memory operations may not observe the coherency protocols as
expected.
But since in ARC HS v2.1 there's no way to disable SLC (AKA L2 cache)
we're doing our best flushing and invalidating it.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
invalidate_dcache_all() could be used in different use-cases
and what is especially important most of those cases won't be
related to DMAed data to or from peripherals, i.e. we'll be doing
invalidation of data used purely by CPU cores.
Given that IOC engine only snoops data that goes through DMA
we need to care ourselves about data used only by CPU cores
and so remove dependency on IOC from invalidate_dcache_all()
and always do real invalidation.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
flush_dcache_all() is used in the very end of U-Boot self relocation
to write back all copied and then patched code and data to their
new location in the very end of available memory space.
Since that has nothing to do with IO (i.e. no external DMA happens
here) IOC won't help here and we need to write back data cache contents
manually.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
ISS is obsolete now and nSIM is used for simulation instead.
In its turn nSIM properly handles baud-rate settings so get rid
of now useless check.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
With release of ARC HS38 v2.1 new IO coherency engine could be built-in
ARC core. This hardware module ensures coherency between DMA-ed data
from peripherals and L2 cache.
With L2 and IOC enabled there's no overhead for L2 cache manual
maintenance which results in significantly improved IO bandwidth.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
ARC core could be configured with different L1 and L2 (AKA SLC) cache
line lengths. At least these values are possible and were really used:
32, 64 or 128 bytes.
Current implementation requires cache line to be selected upon U-Boot
configuration and then it will only work on matching hardware. Indeed
this is quite efficient because cache line length gets hardcoded during
code compilation. But OTOH it makes binary less portable.
With this commit we allow U-Boot to determine real L1 cache line length
early in runtime and use this value later on. This extends portability
of U-Boot binary a lot.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
board_init_f_mem() alters the C runtime environment's
stack it is actually already using. This is not a valid
behaviour within a C runtime environment.
Split board_init_f_mem into C functions which do not alter
their own stack and always behave properly with respect to
their C runtime environment.
Signed-off-by: Albert ARIBAUD <albert.u.boot@aribaud.net>
Acked-by: Thomas Chou <thomas@wytron.com.tw>
[1] Align cache management functions to those in Linux kernel. I.e.:
a) Use the same functions for all cache ops (D$ Inv/Flush)
b) Split cache ops in 3 sub-functions: "before", "lineloop" and
"after". That way we may re-use "before" and "after" functions for
region and full cache ops.
[2] Implement full-functional L2 (SLC) management. Before SLC was
simply disabled early on boot. It's also possible to enable or disable
L2 cache from config utility.
[3] Disable/enable corresponding caches early on boot. So if U-Boot is
configured to use caches they will be used at all times (this is useful
in partucular for speed-up of relocation).
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
With new SMP-enabled CPUs with ARC HS38 cores and corresponding support
in Linux kernel it's required to add basic SMP support in U-Boot.
Currently we assume the one and only core starts execution after
power-on. So most of things in U-Boot is handled in UP mode.
But when U-Boot is used for loading and starting Linux kernel right
before jumping to kernel's entry point U-Boot:
[1] Sets all slave cores to jump to the same address [kernel's entry
point]
[2] Really starts all slav cores
In ARC's implemetation of SMP in Linux kernel all cores are supposed to
run the same start-up code. But only core with ID 0 (master core)
processes further while others are looping waiting for master core to
complete some initialization.
That means it's safe to un-pause slave cores and let them execute kernel
- they will wait for master anyway.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
While testing "arc: make sure _start is in the beginning of .text
section" I haven't done proper clean-up of built binaries and so missed
another tiny bit that lead to the following error:
--->8---
LD u-boot
arc-linux-ld.bfd: cannot find arch/arc/lib/start.o
Makefile:1107: recipe for target 'u-boot' failed
make: *** [u-boot] Error 1
--->8---
Fix is trivial: put "start.o" in "extra-y".
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
This is important to have entry point in the beginning of .text section
because it allows simple loading and execution of U-Boot.
For example pre-bootloader loads U-Boot in memory starting from offset
0x81000000 and then just jumps to the same address.
Otherwise pre-bootloader would need to find-out where entry-point is. In
its turn if it deals with binary image of U-Boot there's no way for
pre-bootloader to get required value.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
ARCv2 cores may have built-in SLC (System Level Cache, AKA L2-cache).
This change adds functions required for controlling SLC:
* slc_enable/disable
* slc_flush/invalidate
For now we just disable SLC to escape DMA coherency issues until either:
* SLC flush/invalidate is supported in DMA APIin U-Boot
* hardware DMA coherency is implemented (that might be board specific
so probably we'll need to have a separate Kconfig option for
controlling SLC explicitly)
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
As discussed on mailing list we're drifting away from
CONFIG_SYS_GENERIC_GLOBAL_DATA in favour to use of board_init_f_mem()
for global data.
So do this for ARC architecture.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Intention behind this work was elimination of as much assembly-written
code as it is possible.
In case of ARC we already have relocation fix-up implemented in C so why
don't we use C for U-Boot copying, .bss zeroing etc.
It turned out x86 uses pretty similar approach so we re-used parts of
code in "board_f.c" initially implemented for x86.
Now assembly usage during init is limited to stack- and frame-pointer
setup before and after relocation.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Simon Glass <sjg@chromium.org>
This separation makes maintenance of code easier because those low-level
interrupt- or exception handling routines are pretty static and usually
require not much care while start-up code is a subject of modifications
and enhancements.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Even though ARCompact and ARCv2 are not binary compatible most of
assembly instructions are used in both. With this change we'll get rid
of duplicate code.
Still IVTs are implemented differently so we're keeping them in separate
files.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
always
Make both invalidate_icache_all() and invalidate_dcache_all() available
even if U-Boot is configured with CONFIG_SYS_DCACHE_OFF and/or
CONFIG_SYS_ICACHE_OFF.
This is useful because configuration of U-Boot may not match actual
hardware features. Real board may have cache(s) but for some reason we
may want to run U-Boot with cache(s) disabled (for example if some
peripherals work improperly with existing drivers if data cache is
enabled). So board may start with cache(s) enabled (that's the case for
ARC cores with built-in caches) but early in U-Boot we disable cache(s)
and make sure all contents of data cache gets flushed in RAM.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
This way we may have very limited set of functions implemented so we
save some space.
Also it allows us to build U-Boot for any ARC core with the same one
toolchain because we don't rely on pre-built libgcc.
For example:
* we may use little-endian toolchain but build U-Boot for ether
endianess
* we may use non-multilibbed uClibc toolchain but build U-Boot for
whatever ARC CPU flavour that current GCC supports
Private libgcc built from generic C implementation contributes only 144
bytes to .text section so we don't see significant degradation of size:
--->8---
$ arc-linux-size u-boot.libgcc-prebuilt
text data bss dec hex filename
222217 24912 214820 461949 70c7d u-boot.libgcc-prebuilt
$ arc-linux-size u-boot.libgcc-private
text data bss dec hex filename
222361 24912 214820 462093 70d0d u-boot.libgcc-private
--->8---
Also I don't notice visible performance degradation compared to
pre-built libgcc (where at least "*div*" functions are had-written in
assembly) on typical operations of downloading 10Mb uImage over TFTP and
bootm.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
This change allows to keep board description clean and minimalistic.
This is especially helpful if one board may house different CPUs with
different features.
It is applicable to both FPGA-based boards or those that have CPUs
mounted on interchnagable daughter-boards.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
It's prohibited to put branch instruction in the very end of zero-delay
loop. On execution this causes "Illegal instruction" exception.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
"reset.c" and "cpu.c" have no architecture-specific code at all.
Others are applicable to either ARC CPU.
This change is a preparation to submission of ARCv2 architecture port.
Even though ARCv1 and ARCv2 ISAs are not binary compatible most of
built-in modules still have the same programming model - AUX registers
are mapped in the same addresses and hold the same data (new featues
extend existing ones).
So only low-level assembly code (start-up, interrupt handlers) is left
as CPU(actually ISA)-specific. This significantyl simplifies maintenance
of multiple CPUs/ISAs.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
* use better symbols for relocatable region boundaries
("__image_copy_start" instead of "CONFIG_SYS_TEXT_BASE")
* remove useless debug messages because they will only show up in case
of both problem (when normal "if" branch won't be taken) and DEBUG take
place which is pretty rare situation.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Even though existing implementation works fine in preparation to
submission of ARCv2 architecture we need this change.
In case of ARCv2 interrupt vector table consists of just addresses
of corresponding handlers. And if those addresses will be in .text
section then assembler will encode them as everything in .text section
as middle-endian and then on real execution CPU will read swapped
addresses and will jump into the wild.
Once introduced new section is situated so .text section remains the
first which allows us to use common linker option for linking everything
to a specified CONFIG_SYS_TEXT_BASE.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
In case of little-endian ARC700 instructions (which may include target
address) are encoded as middle-endian. That's why it's required to swap
bytes after read and ten right before write back.
But in case of big-endian ARC700 instructions are encoded as a plain
big-endian. Thus no need for byte swapping.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Francois Bedard <fbedard@synopsys.com>
Cc: Tom Rini <trini@ti.com>
cc: Noam Camus <noamc@ezchip.com>
These are library functions used by ARC700 architecture.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Francois Bedard <fbedard@synopsys.com>
Cc: Wolfgang Denk <wd@denx.de>
Cc: Heiko Schocher <hs@denx.de>