Commit graph

26 commits

Author SHA1 Message Date
Alexey Brodkin
a4a43fcf9c arc/cache: Flush & invalidate all caches right before enabling IOC
According to ARC HS databook it is required to flush and disable
caches prior programming IOC registers. Otherwise ongoing coherent
memory operations may not observe the coherency protocols as
expected.

But since in ARC HS v2.1 there's no way to disable SLC (AKA L2 cache)
we're doing our best flushing and invalidating it.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-06-13 14:38:05 +02:00
Alexey Brodkin
bd91508b50 arc/cache: really do invalidate_dcache_all() even if IOC exists
invalidate_dcache_all() could be used in different use-cases
and what is especially important most of those cases won't be
related to DMAed data to or from peripherals, i.e. we'll be doing
invalidation of data used purely by CPU cores.

Given that IOC engine only snoops data that goes through DMA
we need to care ourselves about data used only by CPU cores
and so remove dependency on IOC from invalidate_dcache_all()
and always do real invalidation.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-06-13 14:38:05 +02:00
Alexey Brodkin
2a8382c6fe arc/cache: really do flush_dcache_all() even if IOC exists
flush_dcache_all() is used in the very end of U-Boot self relocation
to write back all copied and then patched code and data to their
new location in the very end of available memory space.

Since that has nothing to do with IO (i.e. no external DMA happens
here) IOC won't help here and we need to write back data cache contents
manually.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-04-21 20:09:59 +03:00
Alexey Brodkin
8b15010b1f arc: get rid of running_on_hw
ISS is obsolete now and nSIM is used for simulation instead.
In its turn nSIM properly handles baud-rate settings so get rid
of now useless check.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-04-11 10:20:30 -07:00
Alexey Brodkin
db6ce2312d arc: cache - utilize IO coherency (AKA IOC) engine
With release of ARC HS38 v2.1 new IO coherency engine could be built-in
ARC core. This hardware module ensures coherency between DMA-ed data
from peripherals and L2 cache.

With L2 and IOC enabled there's no overhead for L2 cache manual
maintenance which results in significantly improved IO bandwidth.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-02-20 11:20:05 +03:00
Alexey Brodkin
379b3280b3 arc: cache - accommodate different L1 cache line lengths
ARC core could be configured with different L1 and L2 (AKA SLC) cache
line lengths. At least these values are possible and were really used:
32, 64 or 128 bytes.

Current implementation requires cache line to be selected upon U-Boot
configuration and then it will only work on matching hardware. Indeed
this is quite efficient because cache line length gets hardcoded during
code compilation. But OTOH it makes binary less portable.

With this commit we allow U-Boot to determine real L1 cache line length
early in runtime and use this value later on. This extends portability
of U-Boot binary a lot.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2016-02-20 11:19:53 +03:00
Albert ARIBAUD
ecc306639e Fix board init code to respect the C runtime environment
board_init_f_mem() alters the C runtime environment's
stack it is actually already using. This is not a valid
behaviour within a C runtime environment.

Split board_init_f_mem into C functions which do not alter
their own stack and always behave properly with respect to
their C runtime environment.

Signed-off-by: Albert ARIBAUD <albert.u.boot@aribaud.net>
Acked-by: Thomas Chou <thomas@wytron.com.tw>
2016-01-13 21:05:17 -05:00
Alexey Brodkin
ef639e6f70 arc: significant cache rework
[1] Align cache management functions to those in Linux kernel. I.e.:
    a) Use the same functions for all cache ops (D$ Inv/Flush)
    b) Split cache ops in 3 sub-functions: "before", "lineloop" and
"after". That way we may re-use "before" and "after" functions for
region and full cache ops.

 [2] Implement full-functional L2 (SLC) management. Before SLC was
simply disabled early on boot. It's also possible to enable or disable
L2 cache from config utility.

 [3] Disable/enable corresponding caches early on boot. So if U-Boot is
configured to use caches they will be used at all times (this is useful
in partucular for speed-up of relocation).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-07-01 17:17:27 +03:00
Alexey Brodkin
8b2eb776b1 arc: implement slave cores kick-start for Linux kernel
With new SMP-enabled CPUs with ARC HS38 cores and corresponding support
in Linux kernel it's required to add basic SMP support in U-Boot.

Currently we assume the one and only core starts execution after
power-on. So most of things in U-Boot is handled in UP mode.

But when U-Boot is used for loading and starting Linux kernel right
before jumping to kernel's entry point U-Boot:
 [1] Sets all slave cores to jump to the same address [kernel's entry
point]
 [2] Really starts all slav cores

In ARC's implemetation of SMP in Linux kernel all cores are supposed to
run the same start-up code. But only core with ID 0 (master core)
processes further while others are looping waiting for master core to
complete some initialization.

That means it's safe to un-pause slave cores and let them execute kernel
- they will wait for master anyway.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
2015-07-01 17:17:27 +03:00
Alexey Brodkin
a811492e4f arc: fix separate compilation of start.o
While testing "arc: make sure _start is in the beginning of .text
section" I haven't done proper clean-up of built binaries and so missed
another tiny bit that lead to the following error:
 --->8---
    LD      u-boot
 arc-linux-ld.bfd: cannot find arch/arc/lib/start.o
 Makefile:1107: recipe for target 'u-boot' failed
 make: *** [u-boot] Error 1
 --->8---

Fix is trivial: put "start.o" in "extra-y".

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-10 19:22:40 +03:00
Alexey Brodkin
89576072cb arc: make sure _start is in the beginning of .text section
This is important to have entry point in the beginning of .text section
because it allows simple loading and execution of U-Boot.

For example pre-bootloader loads U-Boot in memory starting from offset
0x81000000 and then just jumps to the same address.

Otherwise pre-bootloader would need to find-out where entry-point is. In
its turn if it deals with binary image of U-Boot there's no way for
pre-bootloader to get required value.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-10 18:45:34 +03:00
Alexey Brodkin
6eb15e50f4 arc: add support for SLC (System Level Cache, AKA L2-cache)
ARCv2 cores may have built-in SLC (System Level Cache, AKA L2-cache).
This change adds functions required for controlling SLC:
 * slc_enable/disable
 * slc_flush/invalidate

For now we just disable SLC to escape DMA coherency issues until either:
 * SLC flush/invalidate is supported in DMA APIin U-Boot
 * hardware DMA coherency is implemented (that might be board specific
   so probably we'll need to have a separate Kconfig option for
   controlling SLC explicitly)

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-03 09:47:50 +03:00
Alexey Brodkin
f56d625ee0 arc: get rid of CONFIG_SYS_GENERIC_GLOBAL_DATA
As discussed on mailing list we're drifting away from
CONFIG_SYS_GENERIC_GLOBAL_DATA in favour to use of board_init_f_mem()
for global data.

So do this for ARC architecture.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-03 09:47:49 +03:00
Alexey Brodkin
3fb8016360 arc: clean-up init procedure
Intention behind this work was elimination of as much assembly-written
code as it is possible.

In case of ARC we already have relocation fix-up implemented in C so why
don't we use C for U-Boot copying, .bss zeroing etc.

It turned out x86 uses pretty similar approach so we re-used parts of
code in "board_f.c" initially implemented for x86.

Now assembly usage during init is limited to stack- and frame-pointer
setup before and after relocation.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Simon Glass <sjg@chromium.org>
2015-04-03 09:47:49 +03:00
Alexey Brodkin
8ee28251d9 arc: move low-level interrupt and exception handlers in a separate file
This separation makes maintenance of code easier because those low-level
interrupt- or exception handling routines are pretty static and usually
require not much care while start-up code is a subject of modifications
and enhancements.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-03 09:47:49 +03:00
Alexey Brodkin
4d93617d87 arc: merge common start-up code between ARC and ARCv2
Even though ARCompact and ARCv2 are not binary compatible most of
assembly instructions are used in both. With this change we'll get rid
of duplicate code.

Still IVTs are implemented differently so we're keeping them in separate
files.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-03 09:47:49 +03:00
Alexey Brodkin
ae4a351ad9 arc: cache - build invalidate_icache_all() and invalidate_dcache_all()
always

Make both invalidate_icache_all() and invalidate_dcache_all() available
even if U-Boot is configured with CONFIG_SYS_DCACHE_OFF and/or
CONFIG_SYS_ICACHE_OFF.

This is useful because configuration of U-Boot may not match actual
hardware features. Real board may have cache(s) but for some reason we
may want to run U-Boot with cache(s) disabled (for example if some
peripherals work improperly with existing drivers if data cache is
enabled). So board may start with cache(s) enabled (that's the case for
ARC cores with built-in caches) but early in U-Boot we disable cache(s)
and make sure all contents of data cache gets flushed in RAM.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-04-03 09:47:49 +03:00
Alexey Brodkin
a67ef280f4 arc: build libgcc in U-Boot
This way we may have very limited set of functions implemented so we
save some space.

Also it allows us to build U-Boot for any ARC core with the same one
toolchain because we don't rely on pre-built libgcc.

For example:
 * we may use little-endian toolchain but build U-Boot for ether
endianess
 * we may use non-multilibbed uClibc toolchain but build U-Boot for
whatever ARC CPU flavour that current GCC supports

Private libgcc built from generic C implementation contributes only 144
bytes to .text section so we don't see significant degradation of size:
--->8---
$ arc-linux-size u-boot.libgcc-prebuilt
   text	   data	    bss	    dec	    hex	filename
 222217	  24912	 214820	 461949	  70c7d	u-boot.libgcc-prebuilt

$ arc-linux-size u-boot.libgcc-private
   text	   data	    bss	    dec	    hex	filename
 222361	  24912	 214820	 462093	  70d0d	u-boot.libgcc-private
--->8---

Also I don't notice visible performance degradation compared to
pre-built libgcc (where at least "*div*" functions are had-written in
assembly) on typical operations of downloading 10Mb uImage over TFTP and
bootm.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-02-09 16:41:20 +03:00
Alexey Brodkin
205e7a7b77 arc: select cache settings via menuconfig
This change allows to keep board description clean and minimalistic.
This is especially helpful if one board may house different CPUs with
different features.

It is applicable to both FPGA-based boards or those that have CPUs
mounted on interchnagable daughter-boards.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-02-09 16:41:20 +03:00
Alexey Brodkin
5ff40f3d42 arc: define and use PTAG AUX regs for MMUv3 only
DC_PTAG and IC_PTAG registers only exist in MMUv3.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-02-09 16:41:20 +03:00
Igor Guryanov
f958a91fa5 arc: memcmp - fix zero-delay loop utilization
It's prohibited to put branch instruction in the very end of zero-delay
loop. On execution this causes "Illegal instruction" exception.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
2015-02-09 16:41:20 +03:00
Alexey Brodkin
660d5f0d49 arc: move common sources in library
"reset.c" and "cpu.c" have no architecture-specific code at all.
Others are applicable to either ARC CPU.

This change is a preparation to submission of ARCv2 architecture port.

Even though ARCv1 and ARCv2 ISAs are not binary compatible most of
built-in modules still have the same programming model - AUX registers
are mapped in the same addresses and hold the same data (new featues
extend existing ones).

So only low-level assembly code (start-up, interrupt handlers) is left
as CPU(actually ISA)-specific. This significantyl simplifies maintenance
of multiple CPUs/ISAs.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
2015-01-15 22:40:49 +03:00
Alexey Brodkin
1c91a3d979 arc: relocate - minor refactoring and clean-up
* use better symbols for relocatable region boundaries
("__image_copy_start" instead of "CONFIG_SYS_TEXT_BASE")
 * remove useless debug messages because they will only show up in case
of both problem (when normal "if" branch won't be taken) and DEBUG take
place which is pretty rare situation.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
2015-01-15 22:38:42 +03:00
Igor Guryanov
20a58ac0d8 arc: introduce separate section for interrupt vector table
Even though existing implementation works fine in preparation to
submission of ARCv2 architecture we need this change.

In case of ARCv2 interrupt vector table consists of just addresses
of corresponding handlers. And if those addresses will be in .text
section then assembler will encode them as everything in .text section
as middle-endian and then on real execution CPU will read swapped
addresses and will jump into the wild.

Once introduced new section is situated so .text section remains the
first which allows us to use common linker option for linking everything
to a specified CONFIG_SYS_TEXT_BASE.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Igor Guryanov <guryanov@synopsys.com>
2015-01-15 22:38:42 +03:00
Alexey Brodkin
36ae5cd2a8 arc: fix relocation for big-endian target
In case of little-endian ARC700 instructions (which may include target
address) are encoded as middle-endian. That's why it's required to swap
bytes after read and ten right before write back.

But in case of big-endian ARC700 instructions are encoded as a plain
big-endian. Thus no need for byte swapping.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>

Cc: Francois Bedard <fbedard@synopsys.com>
Cc: Tom Rini <trini@ti.com>
cc: Noam Camus <noamc@ezchip.com>
2014-02-21 07:56:42 -05:00
Alexey Brodkin
2272382879 arc: add library functions
These are library functions used by ARC700 architecture.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Francois Bedard <fbedard@synopsys.com>
Cc: Wolfgang Denk <wd@denx.de>
Cc: Heiko Schocher <hs@denx.de>
2014-02-07 08:14:32 -05:00