SPC2 File Format Spec
---------------------

V1.1
1/22/12
K.Horton
----


Problem
-------

After implementing an SPC player in an FPGA, I needed some way to pack the SPC's into a format that was usable by my FPGA and real live SNES systems.  The current .RSN format unfortunately does not allow this, because RAR uncompression is practically impossible on the SNES or an FPGA.

Background
----------

Each .RSN file contains a collection of SPC's which are simply 64K memory dumps of the SPC memory, along with a header and footer which contains the song name, length, etc. in RAR format.  In fact, the .RSN file can simply be renamed to .RAR and the SPCs inside can be extracted.

RAR was traditionally used because it did solid archiving, which saved a large amount of space, due to many SPCs in a set having redundant data (i.e. player code, samples, etc).


Solution
--------

I decided that a decent approach for compressing the data in a form easy to use by the SNES, FPGA, or other hardware player was to break each SPC in the set up into 256 byte blocks, then remove duplicate blocks, and then store the unique blocks and a map file for each SPC that maps which blocks are assembled into a final 64K block of data for the desired SPC.

I have settled on the .SP2 extension for these files.

I tried several block sizes on a set of 30 different RSN sets, and 256 byte blocks appear to be the best ratio for size reduction and efficiency.  Smaller block sizes resulted in a larger map which offset any potential space savings.  Anything bigger was too coarse and block redundancy was less resulting in more space taken up.


The solution I came up with:

The overall file format is composed of a header, then 1K blocks of data for each SPC represented which holds some basic data (DSP registers, IPL ROM, register values, etc) and then the 512 byte map for the 64K of RAM.

After the 1K SPC data blocks, follows the RAM data blocks, which has had all redundant blocks removed, then finally any trailing metadata. (more about this later)

All multibyte values are little endian.

file offset,  length,  type,  info           description
--------------------------------------------------------
0             5        byte    "KSPC",1ah    a 4 byte header ID and an EOF char.
5             1        byte    01h           major revision (currently 1)
6             1        byte    00h           minor revision (currently 0)
7             2        uint16                number of SPC's in this collection
9             7        byte    00h*7         expansion. must be zeros
16            1024     bytes                 first SPC data block
1040          1024     bytes                 second SPC data block
....
16+(1024*N)   256*M    bytes RAM data blocks N = number of SPC blocks, M = number of RAM data blocks
16+(1024*N)+(256*M)    bytes                 extended metadata (see below)
EOF


Each SPC data block has this format:

offset        length   type    info          description
--------------------------------------------------------
0             512      uint16*256            Offsets for all 256, 256 byte blocks of data in RAM
512           128      bytes                 DSP register data (as-is from .SPC)
640           64       bytes                 IPL ROM (as-is from .SPC)
704           7        bytes                 PCL, PCH, A, X, Y, PSW, SP (as-is from .SPC)
711           1        byte                  channel enable bits, 1 per chan (0 = en, 1 = dis) (as-is)
712           4        bytes                 date 
716           4        uint32                1/64000th's to play before fadeout
720           4        uint32                1/64000th's to fade to silence
724           4        uint32                amplification value (10000h == 1.00)
728           1        byte                  emulator used
729           1        byte                  OST disk
730           2        byte                  OST track#
732           2        uint16                copyright year (stored in binary format)
734           34       bytes                 zero bytes. all should be 000h
768           32       string                song name
800           32       string                game name
832           32       string                artist name
864           32       string                dumper name
896           32       string                comments
928           32       string                official soundtrack title
960           32       string                publisher name
992           28       string                original .spc filename, WITHOUT the .spc extension
1020          4        uint32                pointer to extended data block, if needed


In the above data block, all of the data from byte 512 onward come from the individual SPC files.

RAM block offsets:
------------------

This is a list of all 256 offsets into the RAM data blocks used to assemble our SPC RAM.  the first offset represents bytes 0000-00ffh of SPC RAM,  the second offset is 0100-01ffh, etc.

The offset directly maps to one of the 256 byte blocks of data in our RAM data block.  The first 256 bytes in the RAM data block would be block 0,  the next 256 bytes would be block 1, etc.   The 16 bit block pointer allows for up to 65536 256 byte blocks, which is 16Mbytes of space.  

DSP reg data, IPL ROM, and PCL, PCH, etc:
-----------------------------------------

In the .SPC file, the data is located in the following areas:

(hex)
10100-1017f  128      bytes    DSP register data
101c0-101ff  64       bytes    IPL ROM
00025-0002b  7        bytes    PCL, PCH, A, X, Y, PSW, SP

This data can simply be read from the SPC and stored in the info block as-is.

The rest of the information needs to be extracted a bit.   Refer to the SPC file format specification, located here for this section:

http://vspcplay.raphnet.net/spc_file_format.txt

There are three ways the metadata can be stored:    ASCII format in the header,  binary format in the header, or as an extended "ID666" tag format.

The easiest way probably is to snag some SPC player source to read all the fields in and process them... but if not here is what I do know.

The ASCII format should be discernable by checking out the date, play time and fadeout fields.  It will have ASCII 0-9 (30-39h) in there for ASCII format.  Beware that some bytes of those fields might be zero padded.

I am not sure how else you can discern the two formats; there is no flag or version number I can see.

Moving on...

Date:
-----
stored as four bytes, in mm/dd/yyyy as BCD, like so:

01h, 12h, 20h, 00h would be 01/12/2000
07h, 30h, 19h, 95h would be 07/30/1995

Song length + Fadeout length:
-----------------------------

Song length and fadeout are both stored in 1/64000th's of a second.  This is 2x the length of a sample period.


For the length, if the standard header is used, is seconds*64000.
For the fadeout, if the standard header is used, is milliseconds*64.

If the ID666 header is used, it's a small bit more complicated but not too much.
length is introduction length + (loop length * number of loops) + end length (if positive).
fadeout is simply the fadeout value as it is.

The ID666 tag information should probably be tested for sanity. 


Amplification value:
--------------------

If there is no ID666 tag, or the amplification value is missing from it, default to 10000h (65536).  This indicates an amplification factor of 1.00.  Otherwise, just copy the ID666 value over into this field as it is.

Emulator used, OST disk, OST track:
-----------------------------------

These are copied as-is from the header/ID666 tag.  If a field is not used, fill it with zeros.

Copyright year:
---------------

Convert the year to a uint16 and store it here.  If no copyright year is given, store zero.

Song name, game name, artist name, dumper name, comments, official sound track title, publisher:
------------------------------------------------------------------------------------------------

These are simply 32 byte fields containing the appropriate information.  If a field is not used, fill it with zeros.  

Store the text starting from the first byte, then fill the remainder of the field with zeros.  If all 32 bytes are used, then no zeros are needed.  (i.e. the fields do not need to be null terminated)  If the field's data is longer than 32 bytes, it is extended upon by putting the "overflow" in the extended data block area.

i.e. if the song title is 40 characters long, the first 32 are placed in the song name field, and the remaining 8 are placed in an extended data block (see below).

Original SPC name:
------------------

This field holds the original filename of the SPC file used to generate this data block.  The .SPC extention is NOT INCLUDED since it is always going to be .SPC.  Like the song name, game name, etc. fields, if the filename is longer than 28 characters, it is extended upon by using the extended data block area, just like the other fields.

Pointer to extended data block:
-------------------------------

If one of the text fields is not long enough, this pointer is used to refer to a block of metadata at the end of the file, after the RAM data blocks. This offset is simply a file offset from the first byte of the file (i.e. actual file offset, and not a relative offset).

These fields are documented later.


-----

The RAM data block and .SP2 file generation
-------------------------------------------

This section of the file is simply 256 byte blocks of data used to build the SPC RAM up into memory.

It is generated during the process of making our .SP2 file. 

The file is generated like so:

* Emit the header (KSPC, 1ah, etc) and number of SPCs in the collection.

* Each of the SPCs in the .RSN collection is read one at a time, 256 bytes at a time.

* Initialize our collection of RAM blocks. (it should be empty at this point)

* If the 256 byte SPC block does not exist in our collection, add it to the collection and emit its block number to the output file.
* If the 256 byte SPC block DOES exist in our collection, emit its block number to the output file.

* After processing all 64K of the first SPC, process its metadata to finish off its 1K data block. if extended metadata is needed (i.e. text field is greater than 32 bytes), it can be processed now or a flag set to process it later. 

* Repeat this process for each SPC in our collection.

* Emit the saved collection of RAM blocks we made while processing the individual SPC's.

* Process and emit any extended metadata, if used.

* Close the file.


Extended metadata
-----------------

If an SPC has too much text to fit in the fields provided, the 'overflow' can be stored in these metadata blocks.  I.e. if the game name is longer than 32 characters,  the first 32 chars are stored in the game name field in the SPC's regular block, and then any characters past 32 are stored in a metadata block.  

i.e. if the game name is "This is a long game name, it is more than 32 characters long", it cannot be stored in the 32 byte game name field alone.  This particular string is 60 characters so it would be stored like so:

<SPC data block, offset 800>:
"This is a long game name, it is "  (first 32 characters)

Then there is a pointer to an extended data block, which contains the following data:

002h,01ch,"more than 32 characters long",000h,000h
^--  chunk type + length   ^-- data      ^-- end marker indicating there is no more extended data

The format of the data is a chunk based method, using the following form:

offset  length  type   description
0       1       byte   chunk type (see below)
1       1       byte   length, in bytes
2       N       string the text string

Each chunk follows this same form.  

Chunk types:

0  - EOD  this chunk type indicates the end of our extended metadata.  Its length is always 0 bytes.
1  - Song name
2  - Game name
3  - artist name
4  - dumper name
5  - comments
6  - official soundtrack title
7  - publisher's name
8  - original SPC filename

All other types are not defined.

Notes:

* If a field is longer than 287 characters (32 char field + 255 max length chunk), more chunks of the same type can follow.  i.e. it is perfectly legal to have two or more "song name" chunks, if needed.  Please do not abuse this.

* If the extended metadata for 2 or more SPC's are the same, then both SPC information blocks can point to the same extended metadata block.

* Multiple extended metadata blocks for different SPC's are simply appended one after the other and pointed to by the requisite SPC data blocks.




Changelog:
----------

1/18/2012: 

* Original version

1/22/2012:

* Added original SPC name field
* Moved extended metadata chunk pointer to end of block
* Clarified how the extended metadata is to be used



