Retain a stringbuilder to mutate the string rather than finalizing temporary strings
yields some speed improvements (less gen0 string objects allocated)
I think this was a PKHeX issue that went unnoticed; originally, we didn't include the Á and Í chars in the dictionary.
I checked the transporter code:
The app maintains the international & japanese character tables, and depending on the ROM language, it may change a char to the language-specific entry. Refer to Bulbapedia's notes on the char tables for different languages:
https://bulbapedia.bulbagarden.net/wiki/Character_encoding_in_Generation_II
However, none of these char-changes are able to be reached with a legal char.
Á and Í (only accessible from the Spanish in-game trades) and the german 0xC0 && z <= 0xC6 chars are already in the international table. Every single difference in the VC1/VC2 table is an un-enterable char.
tl;dr -- all possible char codes are transferred fine with the VC2 table without extra language logic. We just keep out any inaccessible char (replaced with space).
Introducing a new PKM format: SK2
Split ICaughtData2 off of PK2 so it can be shared with SK2 when type-checks occur
Add conversion for PK2<->SK2
Split the split-buffer handling for GBPKM to GBPKML (what a name), so that I can reuse shared accessor logic for SK2.
Results in a little more code, but each path is less tangled
simplify some expressions
remove RBDragonair content in favor of a strict filter for catch rate
Move gen1 trade trainer names to stringconverter, since pk1/pk2 shouldn't refer to legality classes
Removes empty trashbyte array allocation (less objects)
Change int[] to byte[] (less filesize/mem) (-256*6)
Change int[] to ushort[], precompute reverse table in saved space
removes dictionary lookup for array index fetch (faster, less memory, no
temp obj allocations!)
could make the ushort[] arrays into byte[] by changing them to be value
shifts? Not worth saving filesize for cpu.
reduces loading time (don't have to allocate conversion arrays when
launching a gen7 game), and separates things to easier to manage
locations
reworks gen3 string encode/decode, no longer does 3->4->5 and 5->4->3;
instead goes straight to the end result without an intermediary format.
String sanitization should probably be broken up rather than reused, oh
well.